The Future of IDNA

Erik van der Poel erikv at google.com
Fri Mar 20 03:22:47 CET 2009


>> > O.k., I'm a Unicode and language expert on this mailing list. *I*
>> > think mapping tonos away in the protocol is a bad idea. That is
>> > the kind of equivalencing that *should* be done by bundling
>> > (if required).
>>
>> Do you have first-hand experience with the difficulty of bundling?
>
> Of course not. What a silly question. I'm not a zone administrator.

It is not a silly question. You are asking many modern Greek
registrants to feel the pain of bundling just because you want a few
classical Greek registrants to be able to register names with/without
tonos separately.

Since you do not know how painful it is to bundle, you should not be
so quick to suggest doing it forever.
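
One way to see part of the bundling burden is to enumerate the labels a registry would have to tie together for a single name. The sketch below is only an illustration under stated assumptions: the function name `tonos_bundle` is hypothetical, the label is assumed to be Greek-only, and a "variant" is assumed to mean keeping or dropping each tonos that is already present.

```python
import itertools
import unicodedata

# After NFD, Greek tonos appears as U+0301 COMBINING ACUTE ACCENT.
TONOS = "\u0301"

def tonos_bundle(label):
    """Return every label obtained by keeping or dropping each tonos.

    Hypothetical helper: assumes a Greek-only label, so every U+0301
    found after decomposition is treated as a tonos.
    """
    decomposed = unicodedata.normalize("NFD", label)
    # A tonos contributes two choices (keep it or drop it);
    # every other character contributes exactly one.
    choices = [(ch, "") if ch == TONOS else (ch,) for ch in decomposed]
    return {
        unicodedata.normalize("NFC", "".join(combo))
        for combo in itertools.product(*choices)
    }
```

Under these assumptions a label with n tonos marks yields 2^n variants, so a two-word label with one tonos per word already needs four registrations in its bundle.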

Under my proposal, the bundling would not be forever. It would
strictly be a transition tactic. Display would be achieved via
http://<domain-name>/idnproto.txt.

> But heading down this path of mapping *this* kind of information
> in the protocol is a one-way ticket to hell, IMO. It leads
> directly to the question of mapping simplified and traditional
> forms of Han characters to each other, which is a much, much
> bigger and less tractable problem

Who is talking about mapping for simplified and traditional Han?

> than ignoring a few accents
> in Latin, Greek, or Cyrillic.

Who is talking about ignoring accents in Latin or Cyrillic?

Again, under my proposal, Greek tonos would be displayed.
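
A minimal sketch of what "mapping tonos away" could look like for lookup purposes, while leaving Latin accents untouched. This is not taken from any IDNA draft: the function names are hypothetical, and the approach (decompose, drop U+0301 only when it follows a Greek base letter, recompose) is one assumption about how such a mapping might be expressed.

```python
import unicodedata

# After NFD, Greek tonos appears as U+0301 COMBINING ACUTE ACCENT.
TONOS = "\u0301"

def is_greek(ch):
    """True if the character's Unicode name marks it as Greek."""
    return "GREEK" in unicodedata.name(ch, "")

def strip_tonos(label):
    """Remove tonos from Greek letters only (hypothetical mapping step)."""
    out = []
    prev_base = ""
    for ch in unicodedata.normalize("NFD", label):
        if unicodedata.combining(ch) == 0:
            # Remember the most recent base (non-combining) character.
            prev_base = ch
        if ch == TONOS and prev_base and is_greek(prev_base):
            # Drop the accent only when it sits on a Greek base letter.
            continue
        out.append(ch)
    return unicodedata.normalize("NFC", "".join(out))
```

With this sketch, `strip_tonos("ελληνικά")` gives `"ελληνικα"`, while `strip_tonos("café")` is returned unchanged because the acute accent follows a Latin base letter.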

>> By the way, what languages require the separate registration of names
>> that differ only in the presence or absence of tonos? And how large
>> are those communities?
>
> Well, *Classical* Greek. Or for that matter any use of polytonic
> Greek, where the point is to use the accents to make distinctions.
>
> Of course it is pointless to ask how large is the Classical Greek
> "community", in one sense, because we are talking about historic
> usage, for the most part. But if you map away accent distinctions
> for Greek letters by *protocol*, then you preclude the possibility
> that someone could want to make a label distinction on that
> basis -- just as they might for accented Latin letters.

Again, who is suggesting that we strip accents from Latin?

> As Andrew pointed out, you make life easier for some, but you
> end up goring somebody else's ox.

Would you rather cause pain for modern Greek users than
classical/polytonic Greek users?

> And we just haven't got enough time slices left in the decade to
> parse all the nuances of one community's preferences against
> another's on a language-by-language and character-by-character
> basis.

Why do we have to finalize the IDNA mapping spec this decade? We can
come up with a spec that includes IDNA2003's mappings plus tonos
mappings this year. We can add and remove a few other mappings three
years after that. And so on.

> Nor would attempting to do so make the protocol functionally
> better, IMO. All it would accomplish would be to make it
> more complicated, more subject to arbitrary (and not so
> arbitrary) political complaints about X's interests versus
> Y's interests, and most of all, would delay its approval
> mightily.

See above. An initial mapping spec can be written and implemented this
year. Further additions and removals could be highly politicized, yes.
But that might just be a good thing, because there would be a healthy
delay from version to version.

>> Yes, it is in the opposite direction, and I have outlined a transition
>> strategy for adding or removing mappings. Is the strategy unclear?
>
> I think a transition strategy for adding or removing mappings
> is inherently flawed.

Why? Please remember that it would only affect a small number of
mappings at each version.

> Either we go with IDNA 2008 pretty much as currently defined,
> take the one-time transition hit, and hope that the Unicode
> Consortium (or somebody) can provide a preprocessing for
> maximal IDNA 2003 interoperability.

If the mapping spec is outside the main IDNA protocol, there is a
chance that some clients might not implement it. This would mean
that the Greeks would have to bundle forever.

> Or we go with Paul's approach and attempt to maximize the
> interoperability with IDNA 2003 as the highest priority for
> the revised protocol design.

The Germans want Eszett and several communities want ZWJ and ZWNJ.

>> Should I provide more details?
>
> If you wish, but I'm simply not convinced that the direction
> is fruitful at this point.

I have an Internet Draft, but I need to change it a bit before I can
send it to this list.

Erik

