Mixing scripts (Re: Unicode versions (Re: Criteria for exceptional characters))
Michael Everson
everson at evertype.com
Sun Dec 24 19:14:52 CET 2006
At 12:06 -0500 2006-12-24, John C Klensin wrote:
>> It is Kurdish, and the two letters are for other functional
>> reasons being proposed for addition to the standard.
>
>As part of my continued effort to understand which rules apply
>and when, adding them to the standard, and identifying them as
>"Cyrillic" would seem to violate the "unify when possible" rule.
>What am I missing?
Functional requirements, such as sorting monolingual multiscript text.
>If Unicode didn't start
>with ISO 8859-* and a bunch of other locally developed CCSs as
>input and had it started with a strong and consistent
>unification rule, we would certainly have looked at that
>character and say "it doesn't exist independently in 'Latin' or
>'Cyrillic' scripts, it is just an adaptation of a Greek
>character and should be unified with it".
No, never, because of the functional
requirements. With a single unified character, one
could not expect <o> to sort in three different
places in a multilingual glossary (Russian,
English, Greek).
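To make the sorting point concrete, here is a minimal sketch in Python. It assumes plain code-point order as a stand-in for a real collation, and the glossary entries are arbitrary letter strings rather than real words; the names LOOKALIKES, GLOSSARY, describe and sort_by_code_point are illustrative only. Because the three o-shaped letters are separate characters, each entry can sort with its own script.

```python
import unicodedata

# One look-alike letter per script, each with its own code point.
LOOKALIKES = {
    "Latin": "o",
    "Greek": "\u03bf",
    "Cyrillic": "\u043e",
}

def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# Toy glossary: arbitrary two-letter strings (not real words),
# one "a..." and one "o..." entry per script.
GLOSSARY = [
    "ob", "\u03bf\u03b2", "\u043e\u0431",
    "ab", "\u03b1\u03b2", "\u0430\u0431",
]

def sort_by_code_point(entries):
    # Assumption: code-point order stands in for a real collation.
    return sorted(entries)

if __name__ == "__main__":
    for script, ch in LOOKALIKES.items():
        print(script, describe(ch))
    # Each o-entry lands next to the a-entry of its own script.
    print(sort_by_code_point(GLOSSARY))
```

Had the three letters been unified into one character, every o-entry would necessarily land in a single position, whatever the language of the entry.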
My point is that those requirements are no
different for Kurdish, which has Latin, Cyrillic,
and Arabic orthographies. Kurdish Cyrillic already
uses Aa, Ee, Oo, Öö, and <Schwa><schwa>, which are
identical between Latin and Cyrillic, plus it
uses Qq and Ww. Since we have found Cyrillic Q's
which have a different capital shape than Latin
ones do, it's quite possible that CYRILLIC LETTER
QA will be added, in which case there is but one
straggler, CYRILLIC LETTER WE, and my argument is
that there is no advantage to Kurdish in sticking
to the unification.
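The effect on Kurdish Cyrillic text can be sketched the same way. The helper scripts_of below is hypothetical and guesses a letter's script from the first word of its Unicode character name, which is a rough heuristic and not a real script property lookup. The sample strings are arbitrary letter sequences, not Kurdish words. The second string assumes a Unicode database that includes U+051B CYRILLIC SMALL LETTER QA and U+051D CYRILLIC SMALL LETTER WE, which were encoded after this message was written.

```python
import unicodedata

def scripts_of(text):
    """Guess the scripts in text from the first word of each letter's name.

    Assumption: the name prefix (e.g. 'CYRILLIC', 'LATIN') is used as a
    rough proxy for the script; this is a heuristic for illustration.
    """
    return {unicodedata.name(ch).split()[0] for ch in text if ch.isalpha()}

# Cyrillic a, e, o followed by Latin q, w (the unified approach).
unified = "\u0430\u0435\u043e" + "qw"

# Cyrillic a, e, o followed by Cyrillic qa, we (the disunified approach).
disunified = "\u0430\u0435\u043e" + "\u051b\u051d"

if __name__ == "__main__":
    print(sorted(scripts_of(unified)))     # two scripts in one string
    print(sorted(scripts_of(disunified)))  # a single script
```

With the unified letters, any monolingual Kurdish Cyrillic text containing q or w reads as mixed-script to a check of this kind; with separate Cyrillic letters it does not.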
But I guess this is not the venue for this
discussion. I understand that a script-ban will
not be deeply embedded.
--
Michael Everson * http://www.evertype.com