IAB Statement on Identifiers and Unicode 7.0.0

Shawn Steele Shawn.Steele at microsoft.com
Wed Jan 28 23:36:30 CET 2015

> So it is not merely that these are visually the same, but that they're in the same script, and that other analogous cases get treated as canonically equivalent and these don't.  That's a problem for the IETF because we were using the derived properties and the canonical equivalence under the assumption that they'd give us certain guarantees, and it turns out they don't.  Again, this isn't Unicode's fault, it's just a fact of the way things are.

I'm more than a little confused why we're picking on this character rather than other more likely characters.  I'm not at all sure why "same script" matters (except that it's a little bit easier to question mixed script stuff), and don't understand the bar for "visually the same within the same script".

This character is getting picked on, but l & I are dismissed, though it was pointed out that some fonts render them identicaIIy.  So when are two characters "visually the same"?  Does it matter which font is used?  Does it matter who's doing the looking?  Does the font size or resolution of the display matter?
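
A minimal sketch of the distinction at issue, assuming Python's standard unicodedata module and NFC as the comparison form. The helper name canonically_equivalent is hypothetical and only for illustration. It shows that canonical equivalence is a property of the code points, independent of any font, so it cannot answer the "visually the same" questions above:

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Return True if a and b are identical after NFC normalization.

    Hypothetical helper: comparing NFC forms is one common way to test
    canonical equivalence of two strings.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Precomposed e-acute (U+00E9) vs. "e" followed by COMBINING ACUTE ACCENT
# (U+0301): canonically equivalent, so normalization unifies them.
print(canonically_equivalent("\u00e9", "e\u0301"))

# Lowercase "l" vs. uppercase "I": some fonts draw them alike, but they are
# distinct code points with no canonical relationship.
print(canonically_equivalent("l", "I"))

# The same holds for whole words that differ only in those letters.
print(canonically_equivalent("identically", "identicaIIy"))
```

Only the first comparison reports equivalence. Whether the other pairs look alike depends on the font, size, and reader, which normalization knows nothing about.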

