Eszett

Vint Cerf vint at google.com
Sun Jul 12 07:00:22 CEST 2009


Mark,

this working group has been unable to make progress in part because
people keep revisiting matters on which the working group had already
reached consensus.

At this point, it's my belief and opinion that we should be focused  
solely on mappings-01 and any changes needed to the other documents to  
accommodate the incorporation of mappings-0x into the ensemble of  
papers that make up IDNA2008.

The proposition to specify display practices by mapping on output is
definitely outside the scope of the protocol and the charter of the  
working group as I read it.

The rule that no PVALID character should be mapped into another PVALID  
character seems like a good rule to follow here so that would apply to  
upper and lower case eszett and final sigma, I think.
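
As a rough illustration of that rule, here is a minimal Python sketch that
flags mapping entries whose source is a PVALID character and whose target
consists only of PVALID characters. The PVALID set and the candidate
mapping table below are hypothetical and heavily abbreviated; they are
assumptions for this example and are not taken from mappings-01 or the
tables document.

```python
# Hypothetical, heavily abbreviated data for illustration only; the real
# PVALID set would be derived from the IDNA2008 tables.
PVALID = {"s", "\u00df", "\u03c3", "\u03c2"}  # s, eszett, sigma, final sigma

# An IDNA2003-style candidate mapping table (assumed for this sketch).
CANDIDATE_MAPPING = {
    "\u00df": "ss",      # eszett -> ss
    "\u03c2": "\u03c3",  # final sigma -> sigma
    "S": "s",            # "S" is not in PVALID here, so mapping it is allowed
}


def pvalid_violations(mapping, pvalid):
    """Return entries that map a PVALID character to other PVALID characters."""
    return {
        src: dst
        for src, dst in mapping.items()
        if src in pvalid and dst != src and all(ch in pvalid for ch in dst)
    }


if __name__ == "__main__":
    for src, dst in pvalid_violations(CANDIDATE_MAPPING, PVALID).items():
        print(f"U+{ord(src):04X} is PVALID but would be mapped to {dst!r}")
```

Under these assumptions the eszett and final sigma entries are flagged,
while the case-folding entry for "S" is not.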

Your recollection of DENIC's preference and mine seem to be at odds  
since they appear to me to have said they want eszett to be PVALID and  
not mapped.

vint




On Jul 11, 2009, at 6:57 PM, Mark Davis ⌛ wrote:

> Vint,
>
> I do think it is well worth revisiting the es-zett issue. At the  
> time the consensus was polled, the framework was substantially  
> different because we had no lookup mappings. It was made clear by  
> DENIC, for example, that they wanted es-zett to be in IDNs (everyone  
> does), but preferred having es-zett mapped to "ss" for compatibility  
> if we had lookup mappings.
>
> Having hrefs with es-zett link to different sites depending on the  
> browser's version of IDNA is a significant compatibility,  
> interoperability, and security issue. Ignoring that is likely to  
> represent a major impediment to adoption of IDNA2008.
>
> We could get away with changing the lookup remapping for ZWJ/NJ  
> because they are in so limited usage currently; that is not the case  
> for es-zett or final sigma. And for final sigma, 98% of the value  
> could be accomplished without introducing indeterminate lookup  
> mappings. That is, by mapping as in IDNA2003 on lookup and adding:  
> "When displaying IDNAs, regular sigmas that are final (not followed  
> by letters) SHOULD be converted to final-sigma".
>
> I realize that we are all heartily tired of these topics, but we  
> don't want IDNA2008 to be DOA either.
>
> Mark
>
>
> On Sat, Jul 11, 2009 at 15:20, Vint Cerf <vint at google.com> wrote:
> Shawn et al,
>
> long ago the working group decided that eszett "ß" would be a PVALID
> character and thus allowed for registration and lookup.
>
> I see no utility in re-visiting that now.
>
> What is at issue is what kind of pre-lookup mapping(s) are going to be
> drawn to implementor attention.
>
> The draft mappings-01 provides an example but does not speak to the
> RFC2119 language of MUST, SHOULD, or MAY.
>
> I think the focus of the discussion should be whether we can refine
> mappings-01 language to the point that we can include such
> recommendations.
>
> My sense is that some of us believe that to achieve this goal, more
> needs to be said about under what conditions these mappings should be
> applied or could be ignored.
> One possible approach is to add language explaining in the document
> what the side-effects of ignoring or using the proposed pre-lookup
> mappings might be.
>
> I also believe that the WG has demonstrated a belief that some mapping
> always takes place as strings containing non-LDH characters are
> obtained from users (one way or another) and before these strings are
> in a form suitable for direct lookup in the DNS.
>
> Our problem seems to be how to characterize when such mappings may
> take place and what they are. For example, I think it is clear that
> any string that can be looked up must be in NFC form, and if it does
> not start out that way, it has to be transformed before the DNS can be
> used.
>
> Could we try to focus on this last bit so we can finalize the  
> documents?
>
> vint
>
>
>
>
> On Jul 11, 2009, at 5:58 PM, Shawn Steele wrote:
>
> > I'm definitely an individual now :)
> >
> >> What do you mean by 99% linguistically equivalent? In the new
> >> German orthography, the difference between ß and ss is a very
> >> clear phonetic difference (long preceding vowel for ß, short
> >> preceding vowel for ss).
> >
> > I agree completely that eszett is a letter and that the "new
> > orthography" gives clear and concise rules for when ss and ß are
> > supposed to be used in German, in Germany since 1996.
> >
> > In practice, how is fussball.de pronounced?  Of course that site
> > likely picked ss instead of the correct spelling because ASCII
> > didn't allow eszett.  I think most Germans would recognize ss
> > instead of ß as a widely recognized alternate spelling of words
> > using eszett.
> >
> > IDN is attempting to support International domain names, hopefully
> > consistently.
> >
> > * Eszett has a long history of alternately being spelled with ss.
> > You even used to see "strasse" on some German street signs (yes,
> > spelling reform has caused signs to be reprinted)
> >
> > * What happens to a business with a pre-1996 spelling of its name?
> >
> > * The IDN standard is, I believe, International.  A Swiss user can
> > type fussball.com and a German user type fußball.com.  Are we
> > really taking the stand that the same word, in the same language,
> > should go to a different place just because someone might confuse
> > masse with maße?  We never promised all words would be registered.
> > I can't register "shift.com" for garments, "shift.com" for a
> > worker's overtime rights group, and "shift.com" for a moving
> > organization.
> >
> > * Assuming that I want a German domain name, am I truly going to
> > limit myself to the ß spelling?  I cannot imagine, whether I was
> > masse.de or maße.de, not registering both names.  If nothing else,
> > I may have Swiss customers.
> >
> > * If everyone is going to "bundle" the names anyway, then why break
> > IDNA2003 just to force them to bundle?  Sure, the registrars may not
> > prebundle the names, but as a user I certainly would.
> >
> > * What happens to the existing ss names that should be spelled ß?
> > What happens when someone else beats fussball.de to fußball.de?
> > Are we going to force the registrars to pre-bundle or block those
> > URLs?
> >
> > Some of those points must be new points, or at least ignored
> > points.  If nothing else the last question "how do we expect
> > existing fußball mappings to be migrated" hasn't been answered that
> > I'm aware of.
> >
> > The goal of name services is to provide a pointer to a server.
> > IDNA2003 already allows that with ß.  Maße.de goes to a server.
> > It is nearly irrelevant that masse.de also goes to the same server
> > since AAA.com for the auto club and AAA.com for my local plumbers
> > also would go to the same server.  Same with "shift" or any English
> > homograph.  It cannot possibly be a requirement that everyone get
> > their perfect name.
> >
> > Assuming that ß were allowed and someone wants to register
> > "Maße.de", what would happen if it were already taken?  Worst case
> > they'd say "bummer, I need a different name" and try again with
> > something else.  Is that so bad?  It happens thousands of times a
> > day in ASCII.  At this point it is pretty irrelevant whether they
> > were blocked by masse.de or an existing maße.de.
> >
> > The only problem I see with ß and IDNA2003 is that if I do a
> > reverse DNS lookup on the name, then I end up with the non-standard
> > (unless I'm Swiss) spelling.  Still common enough that users can
> > easily read it, but non-standard enough that spelling teachers would
> > likely frown.  I am not at all sure that solving that reverse
> > lookup/display problem is worth breaking IDNA2003 and confusing
> > Swiss-German / German-German interoperability.
> >
> > IDNA2003 "freaked out" with a minor bug in the Unicode normalization
> > standard that caused a minor ambiguity in character sequences that
> > weren't even linguistically possible, in any language.  The response
> > was to get Unicode to create Corrigendum #5 for UAX#15 because the
> > breaking behavior was intolerable.  Yet breaking the behavior of
> > IDNA2003 for an existing sequence that is commonly used
> > (fußball.de) and valuable, at least to some users, is acceptable?
> >
> > So, specific, new questions:
> > * Do we believe that users (or registrars) are NOT going to bundle
> > these names anyway when getting new names?
> > * How is migration going to happen for users like fussball.de?
> > * Do we feel that there are going to be no Swiss-German/German-
> > German interoperability issues because of this change?
> >
> > -Shawn
> >
> > Note: There was a question regarding ss and ß in comparisons.  Both
> > Google and Bing seem to return results for either spelling.  (I
> > think they compare equal there, or else I was just getting lucky and
> > keywords are added in both forms).
> > _______________________________________________
> > Idna-update mailing list
> > Idna-update at alvestrand.no
> > http://www.alvestrand.no/mailman/listinfo/idna-update
