Unicode & IETF
Shawn.Steele at microsoft.com
Tue Aug 12 00:30:56 CEST 2014
> I still owe you a response to an earlier note. I will try to get back to that soon but, in the interim, let me add a bit to Patrik's response below.
Ok, I sort of wish I wasn't getting responses though :)
> Completely consistent behavior would be wonderful. Beyond joining Patrik in saying "I'd like a pony too", I would note that there was an idea in the programming language standards committee many years ago that the right way to do a universal character set would involve one code point per character, no combining sequences, and, ideally, no script divisions.
I'm laughing, and yes, though it may be naïve in hindsight, I sort of feel like that's what's being requested here: a mathematical certainty that such-and-such character is used only in so-and-so manner.
> It makes a big difference in that respect whether you (or I or Patrik) need to get it right once or to keep juggling. IDNA2008 is based on two hypotheses: that we understood where the peculiarities are looking backwards and have dealt effectively with them and that things will be stable
Which comes back to my earlier question: What is IDNA trying to do? I'm a bit confused by that. I see two things:
A) Enable International Domain Names, which it certainly does, and
B) Try to reduce the confusables, which, IMO, is something registrars can address. Also, IMO, this is a 'nice-to-have' because the scenarios I've heard are around phishing where this is a very minor part of an attack. (I could be missing scenarios).
I feel like I'm hearing a third: Make linguistic domain names mathematically unique so that I can depend on X==Y and maybe map back and forth between them with 100% certainty, particularly if they are rendered the same. I don't feel that IDNA2003 or IDNA2008 accomplished that, and I don't see the class of behavior being discussed as contributing to the problem.
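To make the X==Y point concrete, here is a minimal Python sketch. It is only an illustration, not anything taken from IDNA2003 or IDNA2008: it uses the standard `unicodedata` module, and the helper name `same_after_nfc` is made up for this example. Normalization can fold two encodings of the same character together, but it does nothing for two different characters that merely render alike.

```python
import unicodedata

def same_after_nfc(x: str, y: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", x) == unicodedata.normalize("NFC", y)

# Same character, two encodings: precomposed vs. base letter + combining accent.
precomposed = "\u00e9"    # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"    # 'e' followed by COMBINING ACUTE ACCENT

# Different characters that usually render alike.
latin_a = "a"             # U+0061 LATIN SMALL LETTER A
cyrillic_a = "\u0430"     # U+0430 CYRILLIC SMALL LETTER A

print(precomposed == decomposed)                # raw comparison differs
print(same_after_nfc(precomposed, decomposed))  # normalization makes them equal
print(same_after_nfc(latin_a, cyrillic_a))      # look-alikes stay distinct
```

So "rendered the same" and "compares equal" are separate questions, even after normalization.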
I'd be open to discussing how to use Unicode successfully in environments where such precision is necessary. Though I'm skeptical of the wisdom of mapping a user name, I can see perhaps some use in certificates or the like. However, I don't think that IDN needs nearly that level of certainty. You can't use a name for phishing if the registrar doesn't have it, so I see all of these issues as something to go into guidelines for registrars.
More information about the Idna-update mailing list