Unicode 7.0.0, (combining) Hamza Above, and normalization

Shawn Steele Shawn.Steele at microsoft.com
Fri Aug 8 00:18:03 CEST 2014


Regardless of the status of any individual character, "we", the IDNA WG, messed up character mappings in IDNA2003.  IDNA2008 recognized that and we decided to leverage the mappings defined by Unicode, which has far more expertise dealing with how languages are encoded.  In my opinion that cleaned up a big mess and was a huge win.

I really don't want to get into a pattern of second-guessing and micromanaging Unicode's handling of code points.  That fractures character handling across computing and historically has led to many not-good things, like code pages.  It also confuses implementers and adopters and eats up our time on this list.

Registrars already have mitigations for homographs.  Regardless of whether this character 'should' exist or should have been encoded differently, the existing processes can certainly resolve any confusion.

-Shawn