AW: AW: AW: sharp s (Eszett)
John C Klensin
klensin at jck.com
Tue Mar 18 00:36:35 CET 2008
--On Tuesday, 11 March, 2008 18:13 -0700 Kenneth Whistler
<kenw at sybase.com> wrote:
> There is a German bedding and furniture company called
> Maßlos. They have a website at www.masslos.de. Right now,
> if you type "masslos" or "MASSLOS" or "maßlos" or "MAßLOS"
> into almost any browser, those strings will all end up taking
> you to the intended place, i.e. www.masslos.de.
Yes. And because of how IDNA2003 works, nothing can take you to
"maßlos.de" because, regardless of what can be typed, that
string (or, more specifically, its ACE equivalent) cannot be
registered. Note that, even if one uses Maßlos.DE to get to
it, the result is not an IDN (its DNS form is the ASCII
"masslos.de", not a punycode-encoded ACE), which is the source of
even more confusion.
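A minimal sketch of that mapping, assuming Python's built-in "idna"
codec, which implements the IDNA2003 rules (RFC 3490 with Nameprep).
The helper names to_dns_form and is_ace are made up for this
illustration:

```python
def to_dns_form(name: str) -> bytes:
    """Return the form of a domain name that is looked up in the DNS.

    Python's built-in "idna" codec applies IDNA2003 ToASCII to each
    label, including the Nameprep mappings (case folding, eszett to "ss").
    """
    return name.encode("idna")


def is_ace(dns_form: bytes) -> bool:
    """True if any label is a punycode-encoded ACE label ("xn--" prefix)."""
    return any(
        label.lower().startswith(b"xn--") for label in dns_form.split(b".")
    )
```

Under these assumptions, to_dns_form("maßlos.de") and
to_dns_form("MAßLOS.de") both return b"masslos.de", for which is_ace
is false, while a name such as "närrisch.de" comes back with an
"xn--" label.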
> If IDNAbis introduces a forced distinction for ß from ss,
> then any application adapting IDN's will resolve *differently*
> for "maßlos" and "masslos", and the "maßlos" string will
> end up *not* resolving to the domain currently owned by
> Maßlos. This creates trouble for them, since they may end
> up having to acquire a new domain and redirecting. And/or it
> may create trouble for German registries, which would have
> to deal with a new situation. In general it seems to me it
> would be a world of hurt for any existing German domain
> name holder with an interest in ß.
Absolutely. This gives the registry some choices:
   (i) They can decide to prohibit registration of
   "maßlos", either by prohibiting that particular string
   (because "masslos" is already registered) or by
   prohibiting any registration containing "ß". A user
   who types "Maßlos.de" into an updated application will
   therefore get a "not found" error and will then either
   have to guess at "masslos.de" or will be off in search
   of a search engine or other aid.
   (ii) They can create a variant model for registrations
   of both "masslos" and "Maßlos" or can create a
   "sunrise" model for the existing registrant of
   "masslos". For better or worse, we have a lot of
   experience with both of those options.
Neither of those options is exactly earth-shaking, although, as
several of us have suggested, input from the relevant registries
would be a useful part of our discussions.
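The two choices can be sketched roughly as follows. This is only an
illustration under stated assumptions: the function names, the policy
labels "block" and "variant", the registrant identifiers, and the
single folding rule (ß to "ss") are all hypothetical, not any
registry's actual rules, and a sunrise period is not modelled.

```python
# Hypothetical folding rule: treat eszett as equivalent to "ss".
FOLD = str.maketrans({"ß": "ss"})


def variant_key(label: str) -> str:
    """Labels with the same key are treated as variants of each other."""
    return label.lower().translate(FOLD)


def check_registration(label, requester, registered, policy):
    """Decide on a requested label.

    registered maps already-registered labels to their registrants.
    policy is "block" (choice i) or "variant" (choice ii).
    """
    label = label.lower()
    if label in registered:
        return "refused"  # exact label is already taken
    key = variant_key(label)
    for existing, owner in registered.items():
        if variant_key(existing) == key:
            if policy == "block":
                return "refused"  # (i): conflicting string is prohibited
            # (ii): only the holder of the existing variant may register
            return "allowed" if owner == requester else "refused"
    return "allowed"
```

For example, with registered = {"masslos": "registrant-A"}, a request
for "maßlos" by "registrant-B" is refused under either policy, while
the same request by "registrant-A" is allowed under "variant".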
Indeed, since "masslos" is an ASCII LDH label, had "ß" been
treated as a distinct character in IDNA2003, the registry would
have been faced with the same choice between a sunrise model
and a prohibition then. This situation is really no different
from the one the same registry would face if it had a
registration for "narrisch". Now, that is clearly a
misspelling, but few registries I know of prevent registration
of misspellings (even in the hope of being more
phishing-resistant). So, someone comes along and wants to
register "naerrisch". Should that be prohibited? Or,
variant-like, restricted to the same registrant? Noting that
both of those forms are ASCII LDH names, is the correct
spelling, närrisch, really any different?
The point here is that registries have to make policies about
these things today, even in the ASCII-only space. Their need to
make such policies for the relationships between IDN forms and
variations in (local) ASCII typing conventions exists as much in
IDNA2003 (even with its mapping rules) as it does in the
proposed IDNA200X. And, from that standpoint, the Eszett debate
doesn't really change anything.
In addition, we've discovered that different registries have
made different rules in these areas, both when IDNs were
introduced and on an ongoing basis. Maybe that is ok.
Actually, it had better be ok because we have no practical way
to make and enforce uniform rules at this level.
> And I think that is why Mark wants to make sure that we
> have DE-NIC and all the other German stakeholders explicitly
> on board for any change that would potentially have these
> kinds of impacts for German domain names.
I'd be reluctant to make a change without finding out what
registries for country-code domains associated with countries
with large German-speaking populations have to say about it. At
the same time, what we've found out repeatedly is that we
promptly get out onto thin ice as soon as someone starts
figuring out what constitutes a "stakeholder" and then starts
counting them. Some of us still believe that registries ought
to be service operations and, in the language of RFC 1591,
trustees for the Internet community. If one believes that
model, the stakeholders include the users and registrants, not
just the registries or registrars.