Mixing scripts (Re: Unicode versions (Re: Criteria for exceptional characters))

Harald Alvestrand harald at alvestrand.no
Tue Dec 19 13:47:48 CET 2006


Michael Everson wrote:
> At 09:45 -0800 2006-12-18, Mark Davis wrote:
>> I don't think it is necessary. If mixtures of scripts are not 
>> displayed (eg the user-agent flags them as discussed before), then 
>> they are not a problem. If mixtures of scripts *are* allowed, then 
>> there are so many other problems (eg with Cyrillic) that these pale 
>> in comparison.
>
> I do not understand why (apart from Japanese and Korean) the 
> possibility of mixing scripts is still being discussed as though it 
> were necessary. DON'T mix Syllabics with other scripts, and be done.

We still have to sort out a definition of "script" that makes the 
statement "Don't mix scripts" an actionable statement.

If we use the Unicode script names from the Unicode database's 
"Scripts.txt", the digits 0-9 aren't in "Latin"; they're in "Common".
The combining accents are mostly in "Inherited".

If, by "don't mix scripts", you mean "scripts Common, Inherited and X 
from unicode/Scripts.txt can be mixed in one string, for any value of X, 
but no other mixing is allowed", we can discuss that statement. But I'm 
not at all sure we're all talking about the same thing when we discuss 
the statement.
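
To make that reading concrete, here is a rough sketch of the check it 
implies. Everything in it is an assumption for illustration: the table is 
a tiny hand-copied excerpt in the spirit of Scripts.txt (a real check 
would load the whole file), and the function names are made up.

```python
from bisect import bisect_right

# Hypothetical excerpt of Scripts.txt as (first, last, script) ranges.
# Illustration only: a real implementation would parse the full file.
SCRIPT_RANGES = [
    (0x002D, 0x002D, "Common"),     # HYPHEN-MINUS
    (0x0030, 0x0039, "Common"),     # digits 0-9
    (0x0041, 0x005A, "Latin"),      # A-Z
    (0x0061, 0x007A, "Latin"),      # a-z
    (0x0300, 0x036F, "Inherited"),  # combining diacritical marks
    (0x0410, 0x044F, "Cyrillic"),   # basic Russian alphabet
]
_STARTS = [first for first, _, _ in SCRIPT_RANGES]


def script_of(ch):
    """Return the script name for one character, per the excerpt above."""
    cp = ord(ch)
    i = bisect_right(_STARTS, cp) - 1
    if i >= 0:
        first, last, name = SCRIPT_RANGES[i]
        if cp <= last:
            return name
    # Fail loudly rather than guess for code points outside the excerpt.
    raise LookupError("U+%04X is not in the excerpt table" % cp)


def scripts_of(text):
    """Return the set of script names used by the characters in text."""
    return {script_of(ch) for ch in text}


def mixes_at_most_one_script(text):
    """True if text uses Common, Inherited and at most one other script X."""
    return len(scripts_of(text) - {"Common", "Inherited"}) <= 1
```

Under this reading, "abc123" and "e" followed by U+0301 both pass, while a 
Latin string with U+0430 (Cyrillic a) substituted in does not.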

Is there a list of the Unicode codepoints known to be used in each of 
the ISO 15924 script codes?

Harald


