Eszett

Mark Davis ⌛ mark at macchiato.com
Sun Jul 12 07:55:28 CEST 2009


Quoting from Marcos, message of March 4. My bolding.

...
Breaking backwards compatibility is to my eyes the big stigma of IDNA2008.

So:

e) If mappings are to be removed from the standard, as we thought they
were, then we fall back to our pre 2003 position, that is: we would like ß
to be PVALID (and this is reflected by the current draft situation). Then
there is no havoc anymore, it is up to us as a registry to deal with
eszett, and we'll do it the right way.
*f) But if there is room for negotiation and mappings could be (again) part
of the standard, then we would like eszett to be mapped to "ss" to ensure
backwards compatibility.*

Hope that helps.

Best
Marcos

Mark


On Sat, Jul 11, 2009 at 22:00, Vint Cerf <vint at google.com> wrote:

> Mark,
> this working group has been unable to make progress in part because people
> keep re-visiting matters on which the working group had earlier reached
> consensus.
>
> At this point, it's my belief and opinion that we should be focused solely
> on mappings-01 and any changes needed to the other documents to accommodate
> the incorporation of mappings-0x into the ensemble of papers that make up
> IDNA2008.
>
> the proposition to specify display practices by mapping on output is
> definitely outside the scope of the protocol and the charter of the working
> group as I read it.
>
> The rule that no PVALID character should be mapped into another PVALID
> character seems like a good rule to follow here so that would apply to upper
> and lower case eszett and final sigma, I think.
>
> Your recollection of DENIC's preference and mine seem to be at odds since
> they appear to me to have said they want eszett to be PVALID and not mapped.
>
> vint
>
>
>
>
> On Jul 11, 2009, at 6:57 PM, Mark Davis ⌛ wrote:
>
> Vint,
>
> I do think it is well worth revisiting the es-zett issue. At the time the
> consensus was polled, the framework was substantially different because we
> had no lookup mappings. It was made clear by DENIC, for example, that they
> wanted es-zett to be in IDNs (everyone does), but preferred having es-zett
> mapped to "ss" for compatibility if we had lookup mappings.
>
> Having hrefs with es-zett link to different sites depending on the
> browser's version of IDNA is a significant compatibility, interoperability,
> and security issue. Ignoring that is likely to represent a major impediment
> to adoption of IDNA2008.
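
To make the divergence concrete, here is a minimal sketch in Python. It assumes the standard library's "idna" codec, which implements IDNA2003, stands in for an old client, and that a client which leaves eszett unmapped simply Punycode-encodes the label. The helper names are made up for illustration, and lookup_unmapped() is not a full IDNA2008 implementation.

```python
def lookup_idna2003(label: str) -> str:
    """ACE form under IDNA2003: nameprep maps eszett to "ss" before encoding."""
    return label.encode("idna").decode("ascii")


def lookup_unmapped(label: str) -> str:
    """ACE form if eszett is kept as-is (hypothetical helper, no validity checks)."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


old = lookup_idna2003("fußball")   # what an IDNA2003 client asks the DNS for
new = lookup_unmapped("fußball")   # what a client without the mapping asks for
print(old, new, old == new)
```

The same href therefore yields two different DNS names depending on which of the two behaviours the browser implements.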
>
> We could get away with changing the lookup remapping for ZWJ/ZWNJ because
> they are currently in such limited use; that is not the case for es-zett or
> final sigma. And for final sigma, 98% of the value could be accomplished
> without introducing indeterminate lookup mappings. That is, by mapping as in
> IDNA2003 on lookup and adding: "When displaying IDNs, regular sigmas that
> are final (not followed by letters) SHOULD be converted to final-sigma".
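
A rough sketch of that display rule in Python, for illustration only. It takes "final" to mean exactly what the sentence above says (a sigma not followed by a letter) and nothing more; the function name is hypothetical, and a real rule would probably need further conditions.

```python
SIGMA = "\u03c3"        # GREEK SMALL LETTER SIGMA
FINAL_SIGMA = "\u03c2"  # GREEK SMALL LETTER FINAL SIGMA


def display_form(name: str) -> str:
    """Show a sigma as final sigma when no letter follows it."""
    chars = list(name)
    for i, ch in enumerate(chars):
        if ch != SIGMA:
            continue
        followed_by_letter = i + 1 < len(chars) and chars[i + 1].isalpha()
        if not followed_by_letter:
            chars[i] = FINAL_SIGMA
    return "".join(chars)


print(display_form("οδοσ.gr"))
```

Lookup is untouched here: the conversion is applied only to the string shown to the user, so the name that goes to the DNS stays the one IDNA2003 would produce.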
>
> I realize that we are all heartily tired of these topics, but we don't want
> IDNA2008 to be DOA either.
>
> Mark
>
>
> On Sat, Jul 11, 2009 at 15:20, Vint Cerf <vint at google.com> wrote:
>
>> Shawn et al,
>>
>> long ago the working group decided that eszett "ß" would be a PVALID
>> character and thus allowed for registration and lookup.
>>
>> I see no utility in re-visiting that now.
>>
>> What is at issue is what kind of pre-lookup mapping(s) are going to be
>> drawn to implementor attention.
>>
>> The draft mappings-01 provides an example but does not speak to the
>> RFC 2119 language of MUST, SHOULD, or MAY.
>>
>> I think the focus of the discussion should be whether we can refine
>> mappings-01 language to the point that we can include such
>> recommendations.
>>
>> My sense is that some of us believe that to achieve this goal, more
>> needs to be said about under what conditions these mappings should be
>> applied or could be ignored.
>> One possible approach is to add language explaining in the document
>> what the side-effects of ignoring or using the proposed pre-lookup
>> mappings might be.
>>
>> I also believe that the WG has demonstrated a belief that some mapping
>> always takes place as strings containing non-LDH characters are
>> obtained from users (one way or another) and before these strings are
>> in a form suitable for direct lookup in the DNS.
>>
>> Our problem seems to be how to characterize when such mappings may
>> take place and what they are. For example, I think it is clear that
>> any string that can be looked up must be in NFC form, and if it does
>> not start out that way, it has to be transformed before the DNS can be
>> used.
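
As an illustration of the NFC point, a small Python sketch using the standard unicodedata module. The sample string is an assumption chosen for the example, not something taken from the drafts.

```python
import unicodedata


def to_nfc(label: str) -> str:
    """Return the label in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", label)


decomposed = "mu\u0308nchen"    # "u" followed by COMBINING DIAERESIS
composed = to_nfc(decomposed)   # single precomposed U+00FC

print(len(decomposed), len(composed))
print(composed == decomposed)
```

The two strings look identical on screen but are different code point sequences, so one of them has to be transformed before lookup.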
>>
>> Could we try to focus on this last bit so we can finalize the documents?
>>
>> vint
>>
>>
>>
>>
>> On Jul 11, 2009, at 5:58 PM, Shawn Steele wrote:
>>
>> > I'm definitely an individual now :)
>> >
>> >> What do you mean by 99% linguistically equivalent? In the new German
>> >> orthography, the difference between ß and ss is a very clear phonetic
>> >> difference (long preceding vowel for ß, short preceding vowel for
>> >> ss).
>> >
>> > I agree completely that eszett is a letter and that the "new
>> > orthography" gives clear and concise rules for when ss and ß are
>> > supposed to be used in German, in Germany since 1996.
>> >
>> > In practice, how is fussball.de pronounced?  Of course that site
>> > likely picked ss instead of the correct spelling because ASCII
>> > didn't allow eszett.  I think most Germans would recognize ss
>> > instead of ß as a widely recognized alternate spelling of words
>> > using eszett.
>> >
>> > IDN is attempting to support International domain names, hopefully
>> > consistently.
>> >
>> > * Eszett has a long history of alternately being spelled with ss.
>> > You even used to see "strasse" on some German street signs (yes,
>> > spelling reform has caused signs to be reprinted)
>> >
>> > * What happens to a business with a pre-1996 spelling of its name?
>> >
>> > * The IDN standard is, I believe, International.  A Swiss user can
>> > type fussball.com and a German user type fußball.com.  Are we really
>> > taking the stand that the same word, in the same language, should go
>> > to a different place just because someone might confuse masse with
>> > maße?  We never promised all words would be registered.  I can't
>> > register "shift.com" for garments, "shift.com" for a worker's
>> > overtime rights group, and "shift.com" for a moving organization.
>> >
>> > * Assuming that I want a German domain name, am I truly going to
>> > limit myself to the ß spelling?  I cannot imagine, whether I was
>> > masse.de or maße.de, not registering both names.  If nothing else, I
>> > may have Swiss customers.
>> >
>> > * If everyone is going to "bundle" the names anyway, then why break
>> > IDNA2003 just to force them to bundle?  Sure, the registrars may not
>> > prebundle the names, but as a user I certainly would.
>> >
>> > * What happens to the existing ss names that should be spelled ß?
>> > What happens when someone else beats fussball.de to fußball.de?  Are
>> > we going to force the registrars to pre-bundle or block those URLs?
>> >
>> > Some of those points must be new points, or at least ignored
>> > points.  If nothing else the last question "how do we expect
>> > existing fußball mappings to be migrated" hasn't been answered that
>> > I'm aware of.
>> >
>> > The goal of name services is to provide a pointer to a server.
>> > IDNA2003 already allows that with ß.  Maße.de goes to a server.  It
>> > is nearly irrelevant that masse.de also goes to the same server
>> > since AAA.com for the auto club and AAA.com for my local plumbers
>> > also would go to the same server.  Same with "shift" or any English
>> > homograph.  It cannot possibly be a requirement that everyone get
>> > their perfect name.
>> >
>> > Assuming that ß were allowed and someone wants to register
>> > "Maße.de", what would happen if it were already taken?  Worst case
>> > they'd say "bummer, I need a different name" and try again with
>> > something else.  Is that so bad?  It happens thousands of times a
>> > day in ASCII.  At this point it is pretty irrelevant whether they
>> > were blocked by masse.de or an existing maße.de.
>> >
>> > The only problem I see with ß and IDNA2003 is that if I do a reverse
>> > DNS lookup on the name, then I end up with the non-standard (unless
>> > I'm Swiss) spelling.  Still common enough that users can easily read
>> > it, but non-standard enough that spelling teachers would likely
>> > frown.  I am not at all sure that solving that reverse lookup/
>> > display problem is worth breaking IDNA2003 and confusing Swiss-
>> > German / German-German interoperability.
>> >
>> > IDNA2003 "freaked out" with a minor bug in the Unicode normalization
>> > standard that caused a minor ambiguity in character sequences that
>> > weren't even linguistically possible, in any language.  The response
>> > was to get Unicode to create Corrigendum #5 for UAX#15 because the
>> > breaking behavior was intolerable.  Yet breaking the behavior of
>> > IDNA2003 for an existing sequence that is commonly used (fußball.de)
>> > and valuable, at least to some users, is acceptable?
>> >
>> > So, specific, new questions:
>> > * Do we believe that users (or registrars) are NOT going to bundle
>> > these names anyway when getting new names?
>> > * How is migration going to happen for users like fussball.de?
>> > * Do we feel that there are going to be no Swiss-German/German-
>> > German interoperability issues because of this change?
>> >
>> > -Shawn
>> >
>> > Note: There was a question regarding ss and ß in comparisons.  Both
>> > Google and Bing seem to return results for either spelling.  (I
>> > think they compare equal there, or else I was just getting lucky and
>> > keywords are added in both forms).
>> > _______________________________________________
>> > Idna-update mailing list
>> > Idna-update at alvestrand.no
>> > http://www.alvestrand.no/mailman/listinfo/idna-update