Mapping and Variants

Michael Everson everson at
Tue Mar 10 09:06:49 CET 2009

Oh my gods.

Are we back HERE, at THIS decision?

On 10 Mar 2009, at 05:18, Michel SUIGNARD wrote:

> +1 on Mark's message concerning confusability.
> I also think that script mixing within a label should be a client
> application decision, not dictated by protocol.

This is madness. I said this first when Cary started talking to me  
about this, when he was editing a draft when WG2 was at Sophia  

At that time, the idea that Cyrillic and Greek and Latin and Cherokee  
could be permitted to intermix within a script label horrified me --  
unless the idea is to say "feck it, we don't care about being  
responsible for enforcing any security whatsoever".

Was a decision to ban script-mixing within a label made? Or was it not  
made? If it was not made, I am surprised, as I thought it had been. If  
it was made, why the hell is it being proposed to unmake it?

> For many scripts it is in fact innocuous and desirable to be mixed
> with ASCII Latin (take Japanese and Romaji for example). In my days
> at Microsoft, when helping to expose IDN in IE7, we went from a
> fairly restrictive model to a much more open model concerning script
> mixing, clearly banning the problematic cases (such as Greek,
> Cyrillic, Latin mixing), but allowing for example most of the Asian
> scripts to be mixed with Latin, and obviously allowing the mixed
> script scenarios required for Japanese and Korean.


> Finally the script property as exposed by Unicode cannot be used
> without some careful analysis to determine 'single' script. There are
> values such as 'Common' and 'Inherited' which have to be allowed with
> most other script values.

Give examples when you make a statement like this please. Otherwise it  
is scare tactics.

> At the same time, 'Common' is a value that often means 'shared' by  
> at least two scripts, and it does not mean that all 'Common'  
> characters should be mixable with all scripts.


> In other words, it is way too complicated to be enshrined in a
> protocol where stability is a feature.

You have to make arguments by reference to examples that specify your  
concern. Even I, Unicadette that I am, don't find your argument  

> It is better done by registry policies and client application  
> awareness. And it needs to be adjusted as new threats emerge while  
> respecting real need for multi-script labels when no harm potential  
> exists.

Even mixing Burmese and Latin is dangerous because of Latin o and  
Burmese wa (looks like o).
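
To make the o/wa case concrete, here is a minimal Python sketch of a
mixed-script check on a single label. It is an illustration only, not
anything specified by IDNA or Unicode: Python's standard library does
not expose the Unicode Script property, so the hypothetical helper
rough_script() guesses a script from the first word of the character
name, and KNOWN_SCRIPTS is a deliberately tiny, assumed list. Anything
unrecognised is treated as 'Common', and combining marks without a
script word in their name are treated as 'Inherited'.

```python
import unicodedata

# Assumption: a tiny hand-picked list of script words that appear at the
# start of Unicode character names. This is NOT the real Script property.
KNOWN_SCRIPTS = (
    "LATIN", "GREEK", "CYRILLIC", "CHEROKEE", "MYANMAR",
    "HIRAGANA", "KATAKANA", "CJK", "HANGUL",
)


def rough_script(ch):
    """Guess a script for one character from its Unicode name (crude)."""
    name = unicodedata.name(ch, "")
    first_word = name.split(" ", 1)[0] if name else ""
    if first_word in KNOWN_SCRIPTS:
        return first_word
    # Marks with no script word in their name stand in for 'Inherited'.
    if unicodedata.category(ch).startswith("M"):
        return "INHERITED"
    # Digits, hyphen, punctuation and everything else stand in for 'Common'.
    return "COMMON"


def label_scripts(label):
    """Return the set of guessed scripts, ignoring Common and Inherited."""
    return {rough_script(ch) for ch in label} - {"COMMON", "INHERITED"}


def is_mixed(label):
    """True if the label draws on more than one guessed script."""
    return len(label_scripts(label)) > 1


# U+101D MYANMAR LETTER WA standing in for LATIN SMALL LETTER O.
spoof = "g\u101d\u101dgle"
print(sorted(label_scripts(spoof)), is_mixed(spoof))
```

Under these assumptions the spoofed label is reported as mixing LATIN
and MYANMAR, while "google", "abc-123" and "cafe" followed by U+0301
are each reported as single-script. A Japanese-plus-Latin label is
also flagged as mixed, so any exemption for such cases would have to
be a separate policy on top of a check like this.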

You know, last night I sent an IM to Cary:

"I don't know why I remain on the IDNA list. Any time I say anything  
it gets ignored."

Cary responded that he felt that both statements were true for  
everyone on the list.

And these decisions will help run the internet....

