idna-mapping update

Kenneth Whistler kenw at sybase.com
Tue Dec 22 02:55:04 CET 2009


Vint,

> the mappings document was deliberately not normative. It outlined a  
> set of mappings that the authors and others saw as meeting user  
> expectations created in part by the case insensitivity of the ASCII  
> DNS, for example. It does not promote the more extensive mapping found  
> in TR46. Despite the backward compatibility arguments, at least some  
> of us worry that the TR46 mappings may tend to create defacto "PVALID"  
> behaviors in IDN domain name labels by characters that aren't, in fact  
> PVALID. 

But that is *already* the case for every uppercase Latin, Greek,
and Cyrillic letter in Unicode. Those are all *DISALLOWED* in
the tables document in IDNA 2008. And the whole point of
talking about case insensitivity in the mapping document and
suggesting that mapping to lowercase might be an interesting
thing to do before resolving by IDNA 2008 is to treat them
as if they were de facto "PVALID" in IDN domain name labels.
Because that is what everybody expects, and it would be
crazy not to.

The same thing applies to the width variants already discussed
in the mappings document, except that those are compatibility
width variants, rather than case pairs. And once again, they
are *DISALLOWED* by the protocol, but mapping them as suggested
in the mappings document creates de facto "PVALID" behaviors
in IDN domain name labels for them.
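
To make the two cases above concrete, here is a minimal sketch in Python
of a pre-lookup mapping that handles only width variants and case pairs.
The helper names map_width and premap_label are hypothetical, and the
sketch assumes that "width variant" means a character whose Unicode
decomposition is tagged <wide> or <narrow>. It is an illustration only,
not the text of the mappings document or of TR46.

```python
import unicodedata


def map_width(ch: str) -> str:
    """Replace a fullwidth or halfwidth character by its ordinary counterpart.

    Assumption: a width variant is any character whose decomposition
    mapping carries the <wide> or <narrow> compatibility tag.
    """
    decomp = unicodedata.decomposition(ch)
    if decomp.startswith("<wide>") or decomp.startswith("<narrow>"):
        # The decomposition looks like "<wide> 0041": a tag, then hex code points.
        return "".join(chr(int(cp, 16)) for cp in decomp.split()[1:])
    return ch


def premap_label(label: str) -> str:
    """Hypothetical pre-lookup mapping: width variants first, then lowercase."""
    unwidened = "".join(map_width(ch) for ch in label)
    return unwidened.lower()


# Fullwidth uppercase Latin input (U+FF25, U+FF38, ...) ends up as plain
# lowercase ASCII, which is what a lookup would then be performed on.
print(premap_label("\uff25\uff38\uff21\uff2d\uff30\uff2c\uff25"))
```

Every input character changed by these two steps is itself DISALLOWED
by the tables document, yet after the mapping it behaves for the user
exactly as if it had been accepted.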

So the only way I can interpret what you are saying here is that
somehow the group that produced the mappings document seems to
think that there are some "good" mappings (for case pairs
and width variants) that are suggested to produce de facto
"PVALID" behaviors, and there are other "bad" mappings from
IDNA 2003 that shouldn't. But it isn't explicit enough about
that distinction, and in any case I don't think there was
ever really consensus about that point. Instead, there was
just a general weariness and a sense that if the document
wasn't normative, who cares what the recommendation in it
was, so why fight about it?

> At least that's how I have been interpreting the distinction.  
> They also create a kind of conflict with the canonical A-Label/U-label  
> commutation of IDN2008.

How does having a recommendation to do half of the mappings
like IDNA 2003 and not do the other half of the mappings
like IDNA 2003 not also create that conflict?
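
A small sketch of the point, using Python's built-in "idna" codec,
which implements the IDNA 2003 rules (RFC 3490 with nameprep). The
helper names to_a_label and to_u_label are hypothetical wrappers
introduced only for this illustration. Even a mapping as uncontroversial
as lowercasing means the string the user typed does not come back from
the A-label.

```python
def to_a_label(label: str) -> str:
    """Convert one label to its ASCII form with the stdlib IDNA 2003 codec."""
    return label.encode("idna").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Convert an ASCII-form label back to Unicode with the same codec."""
    return a_label.encode("ascii").decode("idna")


typed = "B\u00dcCHER"             # uppercase input, as a user might type it
a_label = to_a_label(typed)        # nameprep case-folds before Punycode encoding
round_tripped = to_u_label(a_label)

print(a_label)
print(round_tripped == typed)      # the uppercase original is not recovered
```

The mapped form and its A-label still round-trip with each other. It is
only the pre-mapping input that falls outside that relationship, and
that holds for case pairs just as much as for any other mapping.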

The way I interpret Michel's suggestion is that *if* the
mapping document is going to make any suggestions at all
about what might be useful things to map before resolving
an input string by the IDNA 2008 protocol, then it is
way better to make that suggestion be precise and as close
to compatibility with IDNA 2003 as possible, since that
is quite likely what all the major browser implementers
will do anyway.

Otherwise, just junk the whole suggestion section in the
mapping document and wave over at TR46 for a suggestion
instead.

I'd be fine with either approach. But a halfway approach
that is vague about what mappings might be useful, based
on some notion that this is what end users might expect,
is less than helpful. What matters here is what the
implementers of the IDNA libraries and the browsers and
indexers expect about backwards compatibility with
IDNA 2003 mappings, IMO.

--Ken



