Stop me if I've misunderstood...

Shawn Steele Shawn.Steele at microsoft.com
Thu Jul 9 00:22:55 CEST 2009


> It seems to me that offering some guidelines for thinking about what
> to do with characters that are, in the context, obviously similar (but
> which might not always be similar all the time, in every context) is a
> reasonable thing to do.  But there doesn't seem to be any way to turn
> that reasonable context-dependent thing into a universally-quantified
> rule:

There is exactly one context: domain names that get you to a server.  That is consistent.  Any other context isn't IDN.

If we permitted per-locale contexts like Turkish and US, then it's broken anyway.  A Turkish context would turn I into ı and İ into i, which breaks the US context.  A Turkish visitor in a US airport, a US visitor in a Turkish airport, or a friendly support person emailing a MICROSOFT support URL to a Turkish user would all break.  There MUST be a consistent mapping; there is no place for something like "multiple contexts" in IDN.
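To make the breakage concrete, here is a rough sketch in Python of what the two contexts would do to the same label.  The function names are made up for illustration, and `turkish_lower` is only a hand-rolled approximation of Turkish lowercasing (I to ı, İ to i), not what any IDNA implementation does.

```python
# Sketch only: contrasts locale-independent lowercasing with a
# hand-rolled approximation of Turkish lowercasing.
# Both function names are hypothetical, not part of any IDNA library.

# Turkish rules: dotless capital I -> dotless ı (U+0131),
# dotted capital İ (U+0130) -> plain i.
_TURKISH_I = str.maketrans({"I": "\u0131", "\u0130": "i"})


def default_lower(label: str) -> str:
    """Locale-independent lowercasing (Python's str.lower)."""
    return label.lower()


def turkish_lower(label: str) -> str:
    """Approximate Turkish lowercasing: remap the I's first, then lower the rest."""
    return label.translate(_TURKISH_I).lower()


if __name__ == "__main__":
    name = "MICROSOFT"
    # The two contexts disagree on the same input, so they would
    # look up two different domain names.
    print(default_lower(name))
    print(turkish_lower(name))
```

The same uppercase string ends up as two different labels depending on which context did the mapping, which is exactly why a single mapping is needed.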

This requirement forces some fairly annoying inconveniences on some users.  It makes it hard for Turkish users to get the expected behavior, which is unfortunate.  Were I to start from scratch I might try making all four I's (I, ı, İ, i) point to the same i.  That limits the possible variations, but at least everyone needing a name like that will get to a server.  There'd also be a problem of which I to display, but that happens with other strings like fußball.  Were I starting from scratch, I'd also try to figure out a way to allow owners of names to specify a form for presentation as well.
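The "all four I's point to the same i" idea could look roughly like the Python sketch below.  This is the hypothetical start-from-scratch mapping, not what IDNA does, and `fold_i` is a name invented here.

```python
# Sketch only: fold I, ı (U+0131), İ (U+0130) and i onto plain "i"
# before lowercasing the rest of the label.
# fold_i is a hypothetical helper, not an IDNA-defined mapping.

_FOLD_I = str.maketrans({"I": "i", "\u0130": "i", "\u0131": "i"})


def fold_i(label: str) -> str:
    """Map every variant of I to 'i', then lowercase the remaining characters."""
    # Translate before lower(): Python lowercases U+0130 to 'i' plus a
    # combining dot above (U+0307), which we want to avoid here.
    return label.translate(_FOLD_I).lower()


if __name__ == "__main__":
    # Different spellings of the same name collapse to one lookup form.
    for spelling in ["D\u0130YARBAKIR", "diyarbak\u0131r", "DIYARBAKIR"]:
        print(spelling, "->", fold_i(spelling))
```

Every spelling reaches the same server, at the cost that the lookup form no longer tells you which I the owner would want displayed.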

Multiple variations, specifically by culture, are the paypal security problem, but worse.  Both variants would be quite legitimately linguistically correct in their culture, whereas at least mixing Cyrillic and ASCII in paypal is arguably silly.

-Shawn


