Prohibiting mapping of PVALID characters

Kenneth Whistler kenw at sybase.com
Thu Dec 10 00:25:15 CET 2009


> At 2:26 PM -0800 12/9/09, Kenneth Whistler wrote:
> >The other problem with this is that it might cause implementers'
> >heads to explode when they realize that normalization to NFC
> >is required (MUST)
> 
> Uh, oh. Where did you get that idea?

Protocol, 5.2:

5.2. Conversion to Unicode

The string is converted from the local character set into
Unicode, if it is not already in Unicode. Depending on local
needs, this conversion may involve mapping some characters
into other characters as well as coding conversions...
The results MUST be a Unicode string in NFC form.


Strings don't magically get to be "in NFC form" without
being mapped (via the normalization algorithm) from whatever form
they started out in, *into* NFC form.

The protocol doesn't state that the Label String Input
has to start out as NFC -- in fact the allowance for
character set conversion from other local character
sets into Unicode would actually preclude that. So
as I see it, any implementation of the lookup for IDNA 2008
is going to have to do NFC normalization as a matter
of course. And that is a *mapping*.
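
To make the point concrete, here is a minimal sketch in Python of what
step 5.2 amounts to in practice. The helper name to_nfc_label and the
sample string are illustrative assumptions, not anything taken from the
protocol draft; the only library call used is the standard
unicodedata.normalize.

```python
import unicodedata


def to_nfc_label(raw: bytes, local_charset: str) -> str:
    """Hypothetical helper: convert a label from a local charset to
    Unicode, then normalize the result to NFC (as 5.2 requires)."""
    text = raw.decode(local_charset)           # coding conversion
    return unicodedata.normalize("NFC", text)  # the normalization mapping


# Sample input (an assumption for illustration): "e" followed by
# U+0301 COMBINING ACUTE ACCENT, which is valid Unicode but not NFC.
decomposed = "cafe\u0301"

label = to_nfc_label(decomposed.encode("utf-8"), "utf-8")

# NFC composes U+0065 U+0301 into the single code point U+00E9, so the
# output differs from the input: a character mapping has taken place.
changed = label != decomposed
print([hex(ord(c)) for c in decomposed])
print([hex(ord(c)) for c in label])
print("input was mapped:", changed)
```

The input here is already Unicode and contains nothing exotic, yet the
code points that come out are not the ones that went in.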

--Ken


