Eszett

Vint Cerf vint at google.com
Sun Jul 12 00:20:51 CEST 2009


Shawn et al,

Long ago the working group decided that eszett ("ß") would be a PVALID
character and thus allowed for registration and lookup.

I see no utility in re-visiting that now.

What is at issue is what kind of pre-lookup mapping(s) are going to be
brought to implementers' attention.

The draft mappings-01 provides an example but does not use the
RFC 2119 language of MUST, SHOULD, or MAY.

I think the focus of the discussion should be whether we can refine  
mappings-01 language to the point that we can include such  
recommendations.

My sense is that some of us believe that to achieve this goal, more
needs to be said about the conditions under which these mappings should
be applied or could be ignored.

One possible approach is to add language to the document explaining
what the side-effects of ignoring or using the proposed pre-lookup
mappings might be.
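
To illustrate the kind of side-effect such language could describe, here is
a minimal sketch, assuming Python's built-in "idna" codec, which implements
IDNA2003 (RFC 3490 with Nameprep) and therefore maps "ß" to "ss". The helper
unmapped_lookup_form is hypothetical: it only Punycode-encodes each label
with no mapping, normalization, or validity checks, to show that skipping
the mapping yields a different name to look up.

```python
def idna2003_lookup_form(name: str) -> bytes:
    """Lookup form under IDNA2003: Nameprep maps 'ß' to 'ss' first."""
    return name.encode("idna")


def unmapped_lookup_form(name: str) -> bytes:
    """Hypothetical lookup form with no pre-lookup mapping at all.

    Assumption: each non-ASCII label is Punycode-encoded as-is and given
    the 'xn--' prefix; no normalization or validity checking is done here.
    """
    labels = []
    for label in name.split("."):
        if label.isascii():
            labels.append(label.encode("ascii"))
        else:
            labels.append(b"xn--" + label.encode("punycode"))
    return b".".join(labels)


mapped = idna2003_lookup_form("fußball.de")
unmapped = unmapped_lookup_form("fußball.de")
print(mapped)    # the ASCII "ss" spelling
print(unmapped)  # a distinct xn-- label that keeps the eszett
```

With the mapping applied, both spellings resolve through the same DNS name;
without it, they are two separate names that may or may not belong to the
same registrant.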

I also believe that the WG has demonstrated a belief that some mapping
always takes place after strings containing non-LDH characters are
obtained from users (one way or another) and before those strings are
in a form suitable for direct lookup in the DNS.

Our problem seems to be how to characterize when such mappings may  
take place and what they are. For example, I think it is clear that  
any string that can be looked up must be in NFC form, and if it does  
not start out that way, it has to be transformed before the DNS can be  
used.
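
As a minimal sketch of that NFC step, assuming Python's standard
unicodedata module (the function name to_nfc and the sample label are
illustrative only):

```python
import unicodedata


def to_nfc(label: str) -> str:
    """Return the label in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", label)


# "u" followed by a combining diaeresis (U+0308): not in NFC form.
decomposed = "mu\u0308nchen"
composed = to_nfc(decomposed)

# A string is already in NFC form if normalizing leaves it unchanged.
print(decomposed == to_nfc(decomposed))  # False: needs transforming first
print(composed == to_nfc(composed))      # True: suitable for further processing
```

Note that NFC by itself leaves "ß" untouched, so normalization alone does
not settle the eszett question.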

Could we try to focus on this last bit so we can finalize the documents?

vint




On Jul 11, 2009, at 5:58 PM, Shawn Steele wrote:

> I'm definitely an individual now :)
>
>> What do you mean by 99% linguistically equivalent? In the new German
>> orthography, the difference between ß and ss is a very clear phonetic
>> difference (long preceding vowel for ß, short preceding vowel for
>> ss).
>
> I agree completely that eszett is a letter and that the "new
> orthography" gives clear and concise rules for when ss and ß are
> supposed to be used in German, in Germany since 1996.
>
> In practice, how is fussball.de pronounced?  Of course that site  
> likely picked ss instead of the correct spelling because ASCII  
> didn't allow eszett.  I think most Germans would recognize ss  
> instead of ß as a widely recognized alternate spelling of words  
> using eszett.
>
> IDN is attempting to support International domain names, hopefully  
> consistently.
>
> * Eszett has a long history of alternately being spelled with ss.   
> You even used to see "strasse" on some German street signs (yes,  
> spelling reform has caused signs to be reprinted)
>
> * What happens to a business with a pre-1996 spelling of its name?
>
> * The IDN standard is, I believe, International.  A Swiss user can
> type fussball.com and a German user type fußball.com.  Are we really
> taking the stand that the same word, in the same language, should go  
> to a different place just because someone might confuse masse with  
> maße?  We never promised all words would be registered.  I can't  
> register "shift.com" for garments, "shift.com" for a worker's
> overtime rights group, and "shift.com" for a moving organization.
>
> * Assuming that I want a German domain name, am I truly going to
> limit myself to the ß spelling?  I cannot imagine, whether I was
> masse.de or maße.de, not registering both names.  If nothing else, I
> may have Swiss customers.
>
> * If everyone is going to "bundle" the names anyway, then why break
> IDNA2003 just to force them to bundle?  Sure, the registrars may not
> prebundle the names, but as a user I certainly would.
>
> * What happens to the existing ss names that should be spelled ß?   
> What happens when someone else beats fussball.de to fußball.de?  Are  
> we going to force the registrars to pre-bundle or block those URLs?
>
> Some of those points must be new points, or at least ignored  
> points.  If nothing else the last question "how do we expect  
> existing fußball mappings to be migrated" hasn't been answered that  
> I'm aware of.
>
> The goal of name services is to provide a pointer to a server.   
> IDNA2003 already allows that with ß.  Maße.de goes to a server.  It  
> is nearly irrelevant that masse.de also goes to the same server
> since AAA.com for the auto club and AAA.com for my local plumbers  
> also would go to the same server.  Same with "shift" or any English  
> homograph.  It cannot possibly be a requirement that everyone get  
> their perfect name.
>
> Assuming that ß were allowed and someone wants to register  
> "Maße.de", what would happen if it were already taken?  Worst case  
> they'd say "bummer, I need a different name" and try again with  
> something else.  Is that so bad?  It happens thousands of times a  
> day in ASCII.  At this point it is pretty irrelevant whether they
> were blocked by masse.de or an existing maße.de.
>
> The only problem I see with ß and IDNA2003 is that if I do a reverse  
> DNS lookup on the name, then I end up with the non-standard (unless  
> I'm Swiss) spelling.  Still common enough that users can easily read
> it, but non-standard enough that spelling teachers would likely  
> frown.  I am not at all sure that solving that reverse lookup/ 
> display problem is worth breaking IDNA2003 and confusing Swiss- 
> German / German-German interoperability.
>
> IDNA2003 "freaked out" with a minor bug in the Unicode normalization  
> standard that caused a minor ambiguity in character sequences that
> weren't even linguistically possible, in any language.  The response  
> was to get Unicode to create Corrigendum #5 for UAX#15 because the  
> breaking behavior was intolerable.  Yet breaking the behavior of  
> IDNA2003 for an existing sequence that is commonly used (fußball.de)  
> and valuable, at least to some users, is acceptable?
>
> So, specific, new questions:
> * Do we believe that users (or registrars) are NOT going to bundle  
> these names anyway when getting new names?
> * How is migration going to happen for users like fussball.de?
> * Do we feel that there are going to be no Swiss-German/German- 
> German interoperability issues because of this change?
>
> -Shawn
>
> Note: There was a question regarding ss and ß in comparisons.  Both  
> Google and Bing seem to return results for either spelling.  (I
> think they compare equal there, or else I was just getting lucky and
> keywords are added in both forms).