Idna-update Digest, Vol 27, Issue 46
Shawn Steele
Shawn.Steele at microsoft.com
Fri Mar 13 02:19:24 CET 2009
> I think I understand why you would want browsers to remain backward
> compatible with IDNA2003 (even though MSIE7 is incompatible with MSIE6
> because MSIE7 allows non-ASCII domain names while MSIE6 doesn't), but
> if we were to keep the mappings for Eszett, Final Sigma, ZWJ and ZWNJ,
> presumably we wouldn't be able to /add/ mappings for Greek tonos
> either (for the same compatibility reasons).
I'm not saying that we can't change from IDNA2003. My main point is that if clients are going to fall back to IDNA2003 for names that are illegal under the new rules anyway, there's no point in making more names illegal (if they were legal in IDNA2003, then they'll still work).
For Final Sigma, Eszett, ZWJ and ZWNJ I think some sort of display hint mechanism is required; otherwise you could end up somewhere different from where you did under IDNA2003. Fortunately, if you have a "correctly" spelled Unicode link/URL, this already works in IDNA2003 (you already get to the DNS record for the mapped name).
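To make the "already works" point concrete, here is a minimal sketch using Python's built-in "idna" codec, which implements IDNA2003 (RFC 3490 with Nameprep). The helper name to_ascii_2003 and the example labels are only illustrative assumptions, not anything from the drafts.

```python
# Sketch: how IDNA2003 maps the four characters discussed above.
# Python's built-in "idna" codec implements IDNA2003 (RFC 3490 + Nameprep).
# The helper name and the sample labels are illustrative assumptions.

def to_ascii_2003(name: str) -> bytes:
    """Apply IDNA2003 ToASCII to each label of a dotted name."""
    return name.encode("idna")

# Eszett (U+00DF) is mapped to "ss", so the result is plain ASCII.
eszett_name = to_ascii_2003("stra\u00dfe.example")

# Final sigma (U+03C2) is mapped to ordinary sigma (U+03C3),
# so both spellings reach the same DNS name.
final_sigma = to_ascii_2003("\u03bf\u03b4\u03bf\u03c2")
plain_sigma = to_ascii_2003("\u03bf\u03b4\u03bf\u03c3")

# ZWNJ (U+200C) and ZWJ (U+200D) are mapped to nothing.
zwnj_name = to_ascii_2003("a\u200cb")
zwj_name = to_ascii_2003("a\u200db")

print(eszett_name)
print(final_sigma == plain_sigma)
print(zwnj_name, zwj_name)
```

In every case the "correctly" spelled Unicode form resolves to the record for the mapped name, which is why only a display hint is missing.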
For German I'm reasonably satisfied that there isn't much value in supporting both eszett and ss as distinct names. The same goes for ZWJ/ZWNJ: I don't think there need to be distinct names with multiple decorations of ZWJ, but I can see that they're required for display. My understanding of Final Sigma is that it could also be solved by a display solution, even though the solution might be a tad odd linguistically.
The Tonos problem is "harder" since it would require adding mappings that didn't exist. Even worse, it would mean that a legal IDNA2003 name would point to a different IDNA2008 name. With the proposed removal of mappings, at least the name being changed would be "new", so there'd be an opportunity to acquire the new name, or to fall back to IDNA2003 if the name was missing completely. For Tonos, the name that the new mapping points to could already be assigned.
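A rough sketch of why Tonos is different, again using Python's IDNA2003 "idna" codec. The strip_tonos function is a purely hypothetical stand-in for whatever new mapping might be added (it is not taken from any draft), and the Greek label is just an example.

```python
# Sketch: a hypothetical tonos-stripping mapping versus IDNA2003.
# strip_tonos is an assumption made for illustration only.
import unicodedata

def strip_tonos(label: str) -> str:
    """Hypothetical new mapping: drop the Greek tonos (combining acute, U+0301)."""
    decomposed = unicodedata.normalize("NFD", label)
    without_accent = "".join(ch for ch in decomposed if ch != "\u0301")
    return unicodedata.normalize("NFC", without_accent)

with_tonos = "\u03ac\u03bb\u03c6\u03b1"     # alpha-with-tonos, lambda, phi, alpha
without_tonos = "\u03b1\u03bb\u03c6\u03b1"  # same word with no tonos

# Under IDNA2003 these are two distinct, independently registrable names.
old_with = with_tonos.encode("idna")
old_without = without_tonos.encode("idna")

# Under the hypothetical mapping, the accented spelling would now land on
# the unaccented name, which may already belong to someone else.
new_with = strip_tonos(with_tonos).encode("idna")

print(old_with != old_without)
print(new_with == old_without)
```

So the accented name is legal today and would silently start resolving to a different, possibly already assigned, name.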
Tonos is somewhat mitigated (in my mind) by the Greek registrar being somewhat diligent about the mappings under IDNA2003, although that covers only one level and only .gr. The difference could hopefully be covered by bundling. The Tonos situation seems pretty broken, though that should be weighed against back-compat. I'll have to give that part of the problem more thought.
For numbers, as per my first point, I don't think it makes sense to try to disallow them now, since clients can just try IDNA2003. I don't find the rendering/homograph argument compelling either.
BTW: I don't think you'd need xd-- for a display name (and that doesn't help make a distinction at the U-label level). It'd be obvious that an xn-- name was a display name if it mapped to a different xn-- name when the rules were applied.
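One way that check could look, as a sketch only: decode the xn-- label with Punycode, re-apply the IDNA2003 rules via Python's "idna" codec, and see whether you get the same label back. The name is_display_only is hypothetical, and the sketch does no error handling for labels that Nameprep would reject.

```python
# Sketch: detect an xn-- label that is only a display form, i.e. one that
# maps to a different name when the IDNA2003 rules are applied.
# is_display_only is a hypothetical helper name; errors are not handled.

def is_display_only(a_label: str) -> bool:
    label = a_label.lower()
    if not label.startswith("xn--"):
        return False
    # Raw Punycode decode, without the round-trip check ToUnicode would do.
    unicode_form = label[4:].encode("ascii").decode("punycode")
    # Apply the IDNA2003 mapping rules and re-encode.
    mapped = unicode_form.encode("idna")
    return mapped != label.encode("ascii")

# A label that encodes eszett directly, bypassing the mapping.
unmapped_eszett = "xn--" + "stra\u00dfe".encode("punycode").decode("ascii")

# An ordinary IDNA2003 label, produced by the codec itself.
ordinary = "b\u00fccher".encode("idna").decode("ascii")

print(is_display_only(unmapped_eszett))
print(is_display_only(ordinary))
```

The first label maps to "strasse" under the rules, so it can only be a display name; the second maps back to itself.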