Mixing scripts (Re: Unicode versions (Re: Criteria for exceptional characters))

Michael Everson everson at evertype.com
Wed Dec 20 00:19:29 CET 2006


At 14:24 -0800 2006-12-19, Kenneth Whistler wrote:

>And the complexity of the mixed script detection heuristics is one 
>of the reasons why none of us (except Michael) is suggesting that it 
>be incorporated as part of the IDNAbis protocol definition per se.

The reason I suggest it is that it is SAFER. If the protocol itself 
disallows mixing in labels, rogues cannot spoof. AT ALL. And 
I mean even in labels at other levels of the name, like preventing 
http://paypal.evertype.com with a Cyrillic <a> or <p>. If the 
protocol bans this behaviour (apart from permitted mixing as for 
Japanese or Korean), the Internet is made safer.
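
As an illustration only (this is not text from any IDNAbis draft), the 
per-label rule described above could be sketched in Python as follows. 
The names PERMITTED_MIXES, script_of, label_is_single_script and 
name_is_acceptable are hypothetical, and the script of a character is 
approximated from the first word of its Unicode character name, because 
the standard library does not expose the Script property. A real check 
would use proper script data and a carefully reviewed list of permitted 
combinations.

```python
import unicodedata

# Hypothetical allow-list of script combinations that may share one label,
# standing in for the "permitted mixing as for Japanese or Korean".
PERMITTED_MIXES = [
    {"HIRAGANA", "KATAKANA", "CJK"},  # Japanese
    {"HANGUL", "CJK"},                # Korean
]


def script_of(ch):
    """Approximate a character's script from its Unicode character name.

    This is a heuristic, not the Unicode Script property. Digits and the
    hyphen are treated as script-neutral and reported as None.
    unicodedata.name() raises ValueError for unnamed code points.
    """
    if ch.isdigit() or ch == "-":
        return None
    return unicodedata.name(ch).split()[0]


def label_is_single_script(label):
    """True if the label uses one script or a permitted combination."""
    scripts = {s for s in map(script_of, label) if s is not None}
    if len(scripts) <= 1:
        return True
    return any(scripts <= allowed for allowed in PERMITTED_MIXES)


def name_is_acceptable(domain):
    """Apply the check to every label, not only the registered one."""
    return all(label_is_single_script(label) for label in domain.split("."))
```

Under these assumptions, name_is_acceptable("paypal.evertype.com") is 
true, while the same name with U+0430 CYRILLIC SMALL LETTER A in place 
of the first Latin a is rejected, because the check runs on every label 
of the name. A label written entirely in Cyrillic still passes, since 
the rule concerns mixing within a label.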

There are no very strong linguistic arguments against this. A few 
edge cases (and they really are marginal in 6000 languages) where a 
Latin orthography borrows a letter from Greek or Cyrillic can be 
handled by encoding clones of those letters in the UCS. Yes, some 
people don't like this, but Oo Oo Oo all look identical whether Latin 
or Cyrillic or Greek and occur far more frequently than, say, 
Cyrillic Ww or Latin Theta.
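
To make the comparison concrete: the three look-alike o's are already 
three separate characters in the UCS, so a clone of a rarer borrowed 
letter would follow the same pattern. A small Python sketch (the names 
LOOKALIKES, describe and distinct_after_nfkc are illustrative only) 
lists them and confirms that compatibility normalization does not merge 
them:

```python
import unicodedata

# Latin, Cyrillic and Greek small o: visually alike, separately encoded.
LOOKALIKES = ["\u006f", "\u043e", "\u03bf"]


def describe(ch):
    """Return the code point and Unicode character name of ch."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch))


def distinct_after_nfkc(chars):
    """True if no two of the characters become equal under NFKC."""
    normalized = {unicodedata.normalize("NFKC", c) for c in chars}
    return len(normalized) == len(chars)


for ch in LOOKALIKES:
    print(describe(ch))
print(distinct_after_nfkc(LOOKALIKES))
```

This prints the three character names (LATIN SMALL LETTER O, CYRILLIC 
SMALL LETTER O, GREEK SMALL LETTER OMICRON) followed by True.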

Indeed, I believe that this solution addresses Vint's concern that 
things be simple and deal with what is "needed for expressive natural 
language".

The only reason I can think of for allowing the mixing of Latin, 
Cyrillic, and Greek letters is to allow registrars and/or registries 
to make bucketloads of money on such mixed labels. They already make 
bucketloads of money. That's good for them. It's good for the 
registrants. Commerce is important, but safety is more important, and 
not only because lack of safety threatens the commerce.

I may be an idealist. But if we are the architects of this thing 
which is going to have worldwide ramifications, we should make the 
PROTOCOL ban things which we know are bad, rather than hope for the 
good will of people, since there are people who will cheat. I may be 
the ONLY person who thinks that what I've proposed is a good idea. I 
believe that human beings are smart and that if there were a will to 
do it, it could be done, even if, because of safety regulations, 
whoever owns I-heart-new-york.com loses it when it expires.

I fear that if we (those responsible for trying to sort out IDN) 
don't do this, it will bite us (the users of the Internet) with very 
poisonous fangs.

I may be, or become, unpopular for saying this, but I was invited to 
participate in this work because I would not shy from saying such 
things.

With best regards,
-- 
Michael Everson * http://www.evertype.com

