Eszett

John C Klensin klensin at jck.com
Wed Aug 13 22:32:28 CEST 2014



--On Wednesday, August 13, 2014 17:27 +0000 Shawn Steele
<Shawn.Steele at microsoft.com> wrote:

>> Exactly, but there's a somewhat popular fussball.de site.
>> Users probably expect the sharp-s version to take them there.
> 
>> Which it does when I try it. If others get a different result
>> this whole thing is really screwed.
> 
> Well, that's likely because the browser developers are
> afraid of what the IDNA2008 change would do to that scenario
> and have implemented the transitional part of UTR#46.  If you
> followed pure IDNA2008, you could end up somewhere else.  I
> shouldn't have randomized the other conversation with this.
> However, IMO, the eszett and ss being unique or different
> aren't really a problem.  (Changing the behavior is).  There
> are numerous ways to spell lots of things in lots of
> languages, so at some point registrars need to be smart and
> bundle (or block).  &/or browser manufacturers need to take
> other steps to prevent phishing … blacklists, certificates,
> etc… -Shawn

Shawn, the difficulty here is that different people have
different expectations.  To reuse one of our oldest examples, I
expect that, if one label is "colour" and another one is
"color", they will match.  I learned about that little spelling
idiosyncrasy as a small child, have gotten used to it, and have
high expectations.  Yet the DNS (and other exact-match systems)
have never allowed those strings to match.  If I'm German and
used to conventions that are commonly used in German typography
to avoid decorated Latin script characters, I expect "nür" to
match "nuer" but I get nervous when people expect "nuer" to
match "nür" because the latter suggests that "Goethe" ought to
match "Göthe" and that would be a rather serious spelling
error.  Should "Göthe" be blocked because some stupid algorithm
might confuse it with the poet?  I don't know, but I'm pretty
sure I want that decision to be made by Germans (in their
registration system if appropriate, but note the several earlier
comments about the protocols that use unmanaged names and/or
server-side matching).
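
The asymmetry can be sketched in a few lines of Python.  This is
only an illustration, not part of any IDNA specification: the
helper names are hypothetical and the table covers only the
three umlauts.

```python
# Hypothetical helpers for illustration; not taken from any IDNA library.
UMLAUT_TO_ASCII = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue"})


def transliterate(label: str) -> str:
    """Lowercase a label and spell out umlauts (ü -> ue)."""
    return label.lower().translate(UMLAUT_TO_ASCII)


def naive_reverse(label: str) -> str:
    """Blindly turn every ae/oe/ue back into an umlaut (deliberately naive)."""
    for ascii_pair, umlaut in (("ae", "ä"), ("oe", "ö"), ("ue", "ü")):
        label = label.replace(ascii_pair, umlaut)
    return label


# Forward direction: the conventional spelling-out works as expected.
forward = transliterate("nür")  # "nuer"

# Two distinct names collapse to the same string under the mapping.
collision = transliterate("Göthe") == transliterate("Goethe")  # True

# Reverse direction: the poet's name is turned into the misspelling.
reverse = naive_reverse("goethe")  # "göthe"
```

Whether the collision should lead to blocking, bundling, or
nothing at all is a policy question that the code cannot answer.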

The same issue applies with "ß" (Eszett, Sharp-S).  "ß" can
more or less safely be mapped to "ss" (although I understand the
current official German orthography says that one should not do
so) but there are words used in German that contain "ss" that
don't map back to "ß".
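
A small Python sketch of why the two treatments of "ß" can lead
to different places.  It rests on two assumptions: Python's
built-in "idna" codec implements the older IDNA2003 mapping (it
is not a UTR#46 or IDNA2008 implementation), and the raw
"punycode" codec is used only to show what the label looks like
when "ß" is kept, without any IDNA2008 validation.  The function
names are hypothetical.

```python
def mapped_label(label: str) -> str:
    """IDNA2003-style lookup via the stdlib codec: "ß" is mapped to "ss"."""
    return label.encode("idna").decode("ascii")


def unmapped_a_label(label: str) -> str:
    """Keep "ß" and Punycode-encode the label as-is (no validation)."""
    return "xn--" + label.encode("punycode").decode("ascii")


mapped = mapped_label("fußball")        # "fussball"
unmapped = unmapped_a_label("fußball")  # "xn--fuball-cta"

# The two lookups query different names in the DNS.
different_names = mapped != unmapped    # True

# The mapping is one-way: "Maße" (measures) and "Masse" (mass) are
# different words, but both fold to the same string.
same_after_folding = "Maße".casefold() == "Masse".casefold()  # True
```

Once a label has been folded to "ss", nothing in the string says
whether the original contained "ß".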

Following the lead in one of Andrew's notes, and needing to get
back to some other things, I'm going to drop out of this
conversation for a while.

One observation before I do: several of your notes seem to
convey the impression that the browser designers/vendors are the
final authority on these matters.  There is history in various
quarters that suggests that national cultural and
competitiveness authorities have sometimes developed different
opinions on those matters and on collections of vendors getting
together without public input to figure out what should be
permitted and what shouldn't.  If I were giving advice --and
I'm definitely not-- I'd try to avoid decisions to change one
string, or one distinct national character, into another,
especially when those decisions are made by a closed group.

     john


