Consensus Call on Latin Sharp S and Greek Final Sigma

Shawn Steele Shawn.Steele at microsoft.com
Tue Dec 1 10:20:03 CET 2009


> I claim we get more confusion if the mapping that happens
> is different _for the same user_ than what the user is used to
> than for example to have the same mapping for two different users.

That's exactly why there MUST be consistent mappings.  It's far worse for the same user to have different behavior just because they're using an airport kiosk computer instead of a machine set up for their local language.  That'd be a phisher's dream scenario.  At least with consistent behavior that differs from the user's natural expectations, it won't change just because of some environmental thing.

> And as some people pointed out, the color / colour issues and similar that already exists.

And if I visit color.com, I don't expect it to work for colour.com if I happen to be in Canada, the BVI, Australia or wherever.  That's because the rules are consistent.  If color.com thinks it's important enough, then they could also register colour.com (and some have sued to get names like that when they're close enough and backed by trademark laws).  If the IETF had made "o" == "ou", then that would've worked too.  I couldn't register "could.com" if "cold.com" was already taken, but it wouldn't matter, the rules would be consistent.  (& I can't register either of them anyway because squatters have them.)

> (I have a key for 'ä' on my keyboard while others might have to press '¨' and then 'a'.)

IME is a different layer I think.  If you see äpfle on a bus and type it in, you should still get to the same place whether you use an NFD Mac or an NFC PC.
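
To make the NFD/NFC point concrete, here is a minimal sketch using Python's standard unicodedata module. The helper name canonical() is only an illustration, and picking NFC as the common form is an assumption made for the example.

```python
import unicodedata

# The same visible label as two different code point sequences:
# precomposed (what an NFC system produces) versus base letter plus
# combining diaeresis (what an NFD system produces).
label_nfc = "\u00e4pfle"      # 'ä' as a single code point
label_nfd = "a\u0308pfle"     # 'a' followed by U+0308

def canonical(label: str) -> str:
    """Normalize a label to NFC so both typed forms compare equal."""
    return unicodedata.normalize("NFC", label)

same_before = label_nfc == label_nfd                      # raw strings differ
same_after = canonical(label_nfc) == canonical(label_nfd)  # normalized forms match
print(same_before, same_after)
```

Whatever the input layer produced, both users end up with the same label once it is normalized, which is the consistency being argued for here.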

> The mapping specifications, the real ones, can be developed in 
> whatever SDO that is out there. W3C? Unicode Consortium? 
> What I think I am more and more certain on is that IETF is not the correct venue.

I agree on that!

> Now, the problem I think is *NOT* the mappings, but as you 
> say Shawn, how to *specifically* handle Sharp S and final sigma.

> We have two alternatives for the core protocol:

> 1. Have it as PVALID
> 2. Have it as DISALLOWED

This'll be funny, since I've been so vocal, but I don't really think it matters much.

The registrars that are interested in ß will probably bundle it with ss (I believe .at and .de have said as much).  I also expect companies that care about ß to register ss if their registrar doesn't do it automagically.  In fact if ß is interesting, then I'd expect lawsuits to get the alternate form when business is involved.  Certainly fussball.de would probably be pretty miffed if fußball.de went somewhere else.

There's probably a small set of users where Herr Fuss doesn't really care to bother with both forms, but those won't be commercial users, and the number of actual cases where the two forms go to different places will probably be very low.

For a fuss.us that doesn't realize ß even exists, it won't matter whether ß is PVALID or not.  (Even if fuss.us doesn't care, nobody's going to brand themselves fuß.us and risk the collision, unless it's trademarked and they expect to get fuss.us as well.)

So long term I expect pretty much no real impact, except that there'd be a round-about mechanism for someone to specify their preferred ß display name.  In the short term there's risk of spoofing while multiple implementations exist, and I'd really prefer a more robust preferred display name form (that could handle CamelCase.Com or AAA.com as well).
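
A rough sketch of what a "preferred display name" could look like, purely as an illustration: nothing like this exists in the protocol. The display_forms table and the register() and display() helpers are hypothetical, and Python's str.casefold() (Unicode full case folding, which happens to map ß to ss) stands in for whatever mapping would really be used.

```python
# Hypothetical table: lookup key -> registrant's preferred display form.
display_forms = {}

def lookup_key(name: str) -> str:
    """Fold a name to one lookup key (str.casefold maps 'ß' to 'ss')."""
    return name.casefold()

def register(preferred: str) -> None:
    """Record the preferred display form under its folded key."""
    display_forms[lookup_key(preferred)] = preferred

def display(typed: str) -> str:
    """Return the preferred form, or the folded key if none is registered."""
    key = lookup_key(typed)
    return display_forms.get(key, key)

register("Fußball.de")
register("CamelCase.Com")
print(display("fussball.de"))
print(display("camelcase.com"))
```

The lookup always goes through the folded key, so both spellings reach the same entry, and the ß or CamelCase form only comes back as a presentation detail.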

DNS is supposed to be about finding machines.  There's no guarantee that any specific name is available for your use.  Even with IDN some names are illegal.  The Seattle Times has to settle for a form without a space, etc.  It isn't interesting that ss and ß go to the machine at IPv4 123.456.789.012.  It IS interesting that the users of those characters can get their preferred display form to work.

> Even in language contexts like Swedish where it is not (ß is just weird,
> but it is definitely not the same as ss).

So?  It's a label that gets you somewhere.  If that place gives back a sensible display form, who cares?  With NFKC in IDNA2003 there are lots of strange mappings that end up back at a more normal place.  Who even knows how to type that ae character anyway?  Don't use the weird form.  (Yeah, I know that the new mapping doc is less permissive than IDNA2003, but why does it matter?)
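
For a feel of the kind of NFKC mappings meant here, a small sketch with Python's unicodedata module. The sample strings are my own picks, and this shows only the NFKC step, not everything IDNA2003's Nameprep does.

```python
import unicodedata

# Compatibility forms that NFKC folds back to ordinary letters.
samples = {
    "fullwidth": "\uff45\uff58\uff41\uff4d\uff50\uff4c\uff45",  # fullwidth "example"
    "ligature": "\ufb01sh",                                      # 'fi' ligature + "sh"
}

# Apply NFKC to each sample and keep the results by name.
normalized = {
    name: unicodedata.normalize("NFKC", text)
    for name, text in samples.items()
}
print(normalized)
```

Both odd-looking inputs come out as plain ASCII labels ("example" and "fish"), so the strange form and the normal form land in the same place.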

> So for me this is a question of choice the domain holder has.
> Can they choose to differentiate between ß and ss or not?

No.  If I tried to register fußball.de, I'd get sued.  There's no real choice here.  The ability to register both forms is an illusion in this particular case.  It doesn't matter what a linguist says is proper, or what subtleties the IETF allows, in practice the functionality we're enabling can't be used.

So I'm not vocal because I think that ß == ss.  I'm vocal because I think it is a bunch of churn and risk that has no real long-term impact or value (unless it's a stepping stone to the "Ecole and ecole both allowed" position).  Long term the end-user will still have the same experience no matter which choice is made.

> We should not destroy and make it impossible to use ß 
> in domain names

Not impossible to use ß, it just goes to the same place as ss, and it should allow for an ß display form (yes, I know that piece is missing).
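
As an illustration of "goes to the same place as ss": Python's built-in "idna" codec follows the IDNA2003 rules (RFC 3490), so it can be used to see the existing mapping. This is only a sketch of current IDNA2003 behavior, not of whatever the new documents end up specifying.

```python
# Encode both spellings with Python's built-in IDNA2003 codec.
# Its Nameprep step maps 'ß' to "ss" before the label is looked up.
with_eszett = "fußball.de".encode("idna")
with_ss = "fussball.de".encode("idna")

# Both spellings produce the same wire-format name.
print(with_eszett, with_ss, with_eszett == with_ss)
```

The ß is usable as input, but what goes on the wire is the ss form, and nothing in the result remembers that ß was the preferred display.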

-Shawn


More information about the Idna-update mailing list