Really OT: internationalized email addresses (Was: french orthography (Was: BCP47 Appeals process))

John Cowan cowan at ccil.org
Wed Sep 24 18:29:27 CEST 2008


Mark Crispin scripsit:

> Let me be clear on this: DNS names, email addresses, etc. are machine
> tokens and NOT natural language.  They are properly seen as the
> equivalent of telephone numbers.

That used to be true.  But now people expect things like ibm.com and
hypercalvinismthemovie.com to Just Work, and this is an advantage denied
to people who don't know the romanization system of their country,
and it's even worse where there is more than one.

> The attempt to "internationalize" these tokens will severely damage
> their utility as global tokens.  I challenge anyone here to visually
> inspect a short text string in Unicode and enter the identical string
> on a keyboard.  Nobody, not even the "Unicode experts" can reliably
> do that.  In an attempt to work around that, we talk about such things
> as "stringprep" and "canonicalization" utterly ignoring the fact that
> these are feeble attempts to lock the barn door while the horse is out.

True enough: the problem turned out to be bigger than anyone thought.
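
For readers who want to see the kind of thing "canonicalization" is
meant to paper over, here is a minimal sketch using Python's standard
unicodedata module.  The sample strings and the helper name
same_after_nfc are illustrative assumptions, not anything taken from
stringprep itself; the point is only that two strings that render
identically can differ code point by code point until they are
normalized.

```python
import unicodedata

# Two spellings of the same visible text (illustrative sample data):
# one uses the precomposed U+00E9, the other "e" plus combining U+0301.
composed = "caf\u00e9"
decomposed = "cafe\u0301"


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A raw comparison sees different code point sequences.
print(composed == decomposed)                 # False
# After normalization the two spellings compare equal.
print(same_after_nfc(composed, decomposed))   # True
```

Normalization handles this case, but it does nothing for distinct
characters that merely look alike, which is where the problem gets
bigger.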

> That's not important to my argument.  We're not talking about good or
> unique fit or even accurate fit.

Sure you are, when you are trying to guess the correct romanization of
a local name.  DNS is unforgiving about such things as taipei.gov.tw
versus taibei.gov.tw, but 臺北.政府.臺灣 is unambiguous.  (I may
have got that wrong; I don't speak Chinese.)
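
As a rough sketch of how such a name reaches the DNS at all, the
following uses Python's built-in "idna" codec, which implements the
IDNA 2003 rules (nameprep plus Punycode, label by label).  The helper
names are hypothetical, and the domain is just the guess from the
paragraph above, not a hostname known to exist.

```python
def to_ascii_domain(name: str) -> str:
    """Convert a Unicode domain name to its ASCII-compatible form."""
    return name.encode("idna").decode("ascii")


def to_unicode_domain(name: str) -> str:
    """Convert an ASCII-compatible domain name back to Unicode."""
    return name.encode("ascii").decode("idna")


# The guessed name from the text above, not a verified hostname.
guess = "臺北.政府.臺灣"
ace = to_ascii_domain(guess)

print(ace)                     # three labels, each starting with "xn--"
print(to_unicode_domain(ace))  # round-trips to the original string
```

Plain ASCII names such as taipei.gov.tw pass through this conversion
unchanged, so the two kinds of name can coexist on the wire.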

> Similarly, a person, upon receipt of a printed email address or DNS
> name, 

"In receipt of" being the critical bit.

> There is a far more sinister agenda at work; to make it impossible
> for these tokens to be used outside the country.  There will be the
> "haves", who have both their "internationalized" email address and
> a global email address using Latin script, and the "have nots" who
> have only an "internationalized" (translation: domestic only) email
> address that nobody outside can access.  Don't think for a moment that
> the stringprep and canonicalization kludges will actually be obeyed.

Is there any actual evidence of this, or is it just your general belief
in total depravity?  In any case, I can type any Unicode character I want.

-- 
He played King Lear as though           John Cowan <cowan at ccil.org>
someone had played the ace.             http://www.ccil.org/~cowan
        --Eugene Field

