Eszett

Mark Davis ⌛ mark at macchiato.com
Mon Jul 20 06:35:05 CEST 2009


My apologies, I gave the wrong examples. Here are the corrected ones:

If we were starting with a blank slate, it would be feasible to have ß/ẞ
kept separate from ss/SS (although note that the German national standards
body doesn't: the uppercase of ß is SS, which is why they got connected in
the first place in Unicode). Similarly, it would have been possible to
separate σ/ς (again, joined because they have a common uppercase Σ).

*But we are not starting with a clean slate:* we are faced with changes to
an existing, widely deployed standard that is over 6 years old, which is a
long time in terms of the web. We have two effective options:

A. Maintain compatibility with IDNA2003

   URLs                         Result                          When
   http://www.γιατρός.gr        http://www.xn--mxads7ake1d.gr   always
   http://www.γιατρόσ.gr
   http://www.ΓΙΑΤΡΌΣ.gr

   http://www.weltfußball.at    http://www.weltfussball.at      always
   http://www.WELTFUẞBALL.at
   http://www.weltfussball.at
   http://www.WELTFUSSBALL.at

B. Deviate from IDNA2003: try to separate them, leaving users with a de facto
indeterminate mapping. That results in the following:

   URLs                         Result                             When
   http://www.γιατρός.gr        http://www.xn--mxads7afk1d.gr      sometimes
                                http://www.xn--mxads7ake1d.gr      sometimes

   http://www.γιατρόσ.gr        http://www.xn--mxads7ake1d.gr      always
   http://www.ΓΙΑΤΡΌΣ.gr

   http://www.weltfußball.at    http://www.xn--weltfuball-b4a.at   sometimes
   http://www.WELTFUẞBALL.at    http://www.weltfussball.at         sometimes

   http://www.weltfussball.at   http://www.weltfussball.at         always
   http://www.WELTFUSSBALL.at

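
The two mappings tabulated above can be reproduced roughly in Python. This is
only a sketch under stated assumptions: to_ascii_2003 relies on Python's
built-in "idna" codec, which implements the IDNA2003 ToASCII operation with
Nameprep, while to_ascii_separate is a hypothetical stand-in for a client
that keeps ß and ς distinct (lowercase, NFC, Punycode). It is not a full
IDNA2008 implementation, and both function names are invented for
illustration.

```python
import unicodedata


def to_ascii_2003(host: str) -> str:
    """IDNA2003-style mapping using Python's built-in 'idna' codec."""
    # Nameprep folds ß to ss and ς to σ before Punycode encoding.
    # The codec leaves pure-ASCII labels untouched, so lowercase the result.
    return host.encode("idna").decode("ascii").lower()


def to_ascii_separate(host: str) -> str:
    """Hypothetical option-B client: ß and ς survive as distinct characters."""
    out = []
    for label in host.split("."):
        # Lowercase one character at a time so that Σ always becomes σ
        # (whole-string lower() would apply the final-sigma rule instead).
        label = "".join(ch.lower() for ch in label)
        label = unicodedata.normalize("NFC", label)
        if label.isascii():
            out.append(label)
        else:
            out.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(out)


for host in ("www.γιατρός.gr", "www.γιατρόσ.gr", "www.ΓΙΑΤΡΌΣ.gr",
             "www.weltfußball.at", "www.WELTFUSSBALL.at"):
    print(host, to_ascii_2003(host), to_ascii_separate(host))
```

For these inputs the two functions agree on every spelling except
www.γιατρός.gr and www.weltfußball.at, which are the rows marked *sometimes*.
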
Why would this happen? Well, for some indefinite period of time, both
IDNA2003 and IDNA2008 clients will exist. So you get on a friend's machine,
go to your bank site, and end up at a spoof site. Moreover, in an effort to
maintain compatibility, most client software will do a dual lookup: first
try one form, then the other. If someone comes in with an intervening
registration for a spoof site, then a URL that used to work for you now
leads to the spoof site.
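
The dual-lookup problem can be illustrated with a toy simulation. Everything
here is an assumption made for the sake of the example: the dual_lookup
helper, the dictionary standing in for the registry, and in particular the
order in which a client would try the two forms.

```python
def dual_lookup(candidates, registered):
    """Return the owner of the first candidate name that is registered."""
    for name in candidates:
        if name in registered:
            return registered[name]
    return None


# Forms a hypothetical dual-lookup client tries for weltfußball.at:
# the separated (ß-preserving) form first, then the IDNA2003 form.
candidates = ["xn--weltfuball-b4a.at", "weltfussball.at"]

# Initially only the IDNA2003 form is registered.
registered = {"weltfussball.at": "legitimate site"}
before = dual_lookup(candidates, registered)

# An intervening registration of the ß form by a third party.
registered["xn--weltfuball-b4a.at"] = "spoof site"
after = dual_lookup(candidates, registered)

print(before, "->", after)
```

The URL typed by the user never changes, yet the answer does once the
intervening registration exists.
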

Now perhaps the NICs for de, at, and ch will address ß/ẞ/ss/SS (and the NIC
for gr will address σ/ς/Σ) by bundling or blocking. But bundling or blocking
defeats the purpose of separating them, and there is little reason to think
that .com, .biz, .whatever will all do the same -- not to speak of the many,
many more registries below the top level.
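
As a rough sketch of what such a registry policy amounts to, the toy class
below ties every spelling that IDNA2003 maps to the same name to a single
registrant. The class, its method, and the registrant labels are
hypothetical, and real bundling or blocking policies are more involved; the
point is only that under such a rule the ß spelling can never be an
independent registration.

```python
def fold_key(name: str) -> str:
    """Key under which IDNA2003 treats different spellings as one name."""
    return name.encode("idna").decode("ascii").lower()


class BundlingRegistry:
    """Toy registry: all spellings with the same key belong to one owner."""

    def __init__(self):
        self._owner_by_key = {}

    def register(self, name: str, owner: str) -> bool:
        # The first registrant of any spelling holds the whole bundle;
        # later requests succeed only if they come from the same owner.
        holder = self._owner_by_key.setdefault(fold_key(name), owner)
        return holder == owner


registry = BundlingRegistry()
first = registry.register("weltfussball.at", "original registrant")
spoof = registry.register("weltfußball.at", "someone else")
same = registry.register("weltfußball.at", "original registrant")
print(first, spoof, same)
```
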


More information about the Idna-update mailing list