Standardizing on IDNA 2003 in the URL Standard

Shawn Steele Shawn.Steele at microsoft.com
Wed Aug 21 19:05:21 CEST 2013


IMO, the eszett and, even more so, the final sigma are largely display issues.  My personal opinion is that we need a display standard (yes, that's not easy).

A non-final sigma at the end of a word isn't (as I understand it) a valid form of the word, so you shouldn't ever have both registered.  It could certainly be argued that IDNA 2003 shouldn't have done this mapping.  If the two forms are truly mutually exclusive, then the biggest problem with 2003 isn't a confusing canonical form, but rather that the word doesn't look right in the 2003 canonical form.  However, there's no guarantee in DNS that I can have a perfect canonical form for my label.  Microsoft, for example, is a proper noun, yet any browser nowadays is going to display microsoft.com, not Microsoft.com.  (Yes, that's probably not "as bad" as the final sigma example.)

Eszett is less clear, because using eszett or ss influences the pronunciation (at least in Germany; in Switzerland that can be different).  I imagine it's rather worse if you're Turkish and prefer different i's.  For German, nobody is ever going to expect fußball.ch and fussball.ch to go to different places.  And nobody's going to be surprised if fußball.de and fussball.de end up at the same site.  (On the contrary, they'd probably be surprised otherwise.)  IMO, this is kind of like dove.com (a bird site) vs dove.com (a swimming site): the two words are spelled the same but have different pronunciations.
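
For anyone who wants to see the 2003 mappings discussed above in action, here is a minimal sketch.  It assumes Python 3, whose built-in "idna" codec implements IDNA 2003 (RFC 3490 with Nameprep).  The helper name to_idna2003 and the Greek example label are my own, picked only for illustration.

```python
def to_idna2003(hostname: str) -> bytes:
    """Return the IDNA 2003 ASCII form of a hostname (illustrative helper)."""
    # Python's standard "idna" codec applies Nameprep (RFC 3491) and then
    # Punycode to each label, per RFC 3490.
    return hostname.encode("idna")


# Eszett is mapped to "ss", so both spellings end up as the same DNS name.
print(to_idna2003("fußball.de"))
print(to_idna2003("fußball.de") == to_idna2003("fussball.de"))

# Final sigma (U+03C2) is mapped to regular sigma (U+03C3) before Punycode
# encoding, so the two spellings are indistinguishable on the wire.
print(to_idna2003("λόγος.gr") == to_idna2003("λόγοσ.gr"))
```

Under IDNA 2003 both comparisons should come out equal, which is the point: the two spellings can never be registered separately, so what is left is the question of which form gets displayed.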

For words that happen to be similar, there's no expectation that a DNS name is available.  AAA Plumbing and all the other AAA whatevers out there aren't going to be surprised that AAA.com is already taken.  So why's German more special than Turkish or English?  And particularly at the expense of spoofability?

I'd much prefer a mechanism to suggest a preferred display form.  That'd solve things like the Turkish I issue as well.

-Shawn



