Mixing scripts (Re: Unicode versions (Re: Criteria for exceptional characters))

Martin Duerst duerst at it.aoyama.ac.jp
Wed Dec 27 05:20:47 CET 2006


At 00:48 06/12/27, Michael Everson wrote:
>At 10:54 +0900 2006-12-26, Martin Duerst wrote:
>
>>I don't think you understood what I meant. What I meant was the following: Assuming that in Kurdish orthography, e.g. Latin 'A' corresponds to Cyrillic 'A', Latin 'R' corresponds to Cyrillic 'P', Latin 'S' corresponds to Cyrillic 'C', and so on, the mixed sorting that I'm proposing is to sort all Latin 'A's with all Cyrillic 'A's, all Latin 'R's with Cyrillic 'P's, all Latin 'S's with Cyrillic 'C's, and so on.
>
>This causes a visual sea-sickness which people really do not prefer. I have many, many, many books with multilingual indices and scripts are universally split up.

Even in Japan, indices for Latin and for Kanji/Hiragana/Katakana are
split up.


>>That way, a user can go directly from pronunciation to an entry in the list without having to look in two places (one in the Latin part of the list and one in the Cyrillic part of the list).
>
>Yes, well, people who use alphabets don't like this. It is confusing.

If indeed they prefer to look in two places, that's of course their
choice.
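
To make the difference between the two orderings concrete, here is a
minimal sketch in Python. It assumes only the three correspondences
mentioned above (Cyrillic A/P/C for Latin A/R/S); the table, the function
names, and the sample entries are made up for illustration and are not
real Kurdish orthography or real names.

```python
# Illustrative only: the three correspondences named in this thread.
# A real table would have to cover the whole alphabet.
CYRILLIC_TO_LATIN = {
    "\u0410": "A",  # Cyrillic capital A
    "\u0420": "R",  # Cyrillic capital ER, looks like Latin P
    "\u0421": "S",  # Cyrillic capital ES, looks like Latin C
}

def to_latin(entry):
    """Spell an entry with the Latin equivalents of its letters."""
    return "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in entry)

def is_cyrillic(entry):
    """True if the entry contains any letter from the table above."""
    return any(ch in CYRILLIC_TO_LATIN for ch in entry)

def interfile_key(entry):
    # Letter correspondence first, script only as a tie-breaker.
    return (to_latin(entry), is_cyrillic(entry))

def split_key(entry):
    # Script first, so that each script gets its own part of the list.
    return (is_cyrillic(entry), to_latin(entry))

# Hypothetical entries built from the three letters only.
entries = [
    "SAR",
    "\u0420\u0410\u0421",  # Cyrillic spelling corresponding to RAS
    "AS",
    "\u0421\u0410\u0420",  # Cyrillic spelling corresponding to SAR
    "RAS",
    "\u0410\u0421",        # Cyrillic spelling corresponding to AS
]

interfiled = sorted(entries, key=interfile_key)
split = sorted(entries, key=split_key)

print(interfiled)
print(split)
```

With the interfiled key, each Cyrillic entry lands directly next to its
Latin counterpart; with the split key, the reader has to look in the
Latin part and then again in the Cyrillic part.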


>(I know that the Japanese do interfile the kanas and Kanji. But Japanese as ever is the splendid exception.)

Even more so, they interfile Katakana and Hiragana. I guess the reason
that this doesn't cause sea-sickness is that this is how everyday
Japanese text looks: a mixture of Kanji, Hiragana, and Katakana.
So seeing it mixed in an index isn't a problem at all.

The reason why Latin is separated isn't so much because it would
cause sea-sickness, but because for Latin, the order is abc...z,
and while you can use that for Japanese, and some dictionaries
do, the more usual order that everybody is most familiar with
is a-(i-u-e-o)-ka-(ki-...)-sa-ta-na-ha-ma-ya-ra-wa, according
to the Japanese syllabary table. 
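
As a rough illustration of interfiling Katakana with Hiragana, the
following Python sketch folds Katakana onto Hiragana before comparing.
It relies on the fact that the two Unicode blocks are laid out in
parallel, 0x60 code points apart, and it assumes that plain code point
order within Hiragana is close enough to the syllabary order for a small
example. The function name and the sample words are chosen for
illustration; Kanji are left out because sorting them needs their
readings, and the prolonged sound mark is not handled.

```python
# Katakana U+30A1..U+30F6 parallels Hiragana U+3041..U+3096.
KATAKANA_START, KATAKANA_END = 0x30A1, 0x30F6
KANA_OFFSET = 0x60

def fold_to_hiragana(text):
    """Replace each Katakana letter by the corresponding Hiragana letter."""
    return "".join(
        chr(ord(ch) - KANA_OFFSET)
        if KATAKANA_START <= ord(ch) <= KATAKANA_END
        else ch
        for ch in text
    )

# Sample words: sakura, anime, kana, tanaka, inu.
words = ["さくら", "アニメ", "かな", "タナカ", "いぬ"]

# Interfiled: Katakana and Hiragana entries share one a-ka-sa-ta-na order.
kana_interfiled = sorted(words, key=fold_to_hiragana)

# Not interfiled: plain code point order puts all Hiragana before Katakana.
kana_separate = sorted(words)

print(kana_interfiled)
print(kana_separate)
```

A production index would more likely use a proper collation library than
raw code points, since voiced and small kana have their own code points
and are only approximately in syllabary order.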

>A mixed-script sort would be a special case, not a default. I have samples of Kurdish-Russian dictionaries with Cyrillic orthography as well as Kurdish-Russian dictionaries with Latin orthography. And Arabic orthography. I really cannot imagine a scenario where people would want these to be mixed.

I cannot imagine it either for a dictionary. What I can imagine
is e.g. a Kurdish-Russian dictionary with Arabic orthography and
Cyrillic equivalents after the main Arabic entry. This makes sense
in particular if the correspondence between the scripts is irregular.

But what this discussion started out with was a name list. For
names, I can imagine that some people prefer to have their
names in Cyrillic, and others in Latin.

Regards,    Martin.


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst at it.aoyama.ac.jp     



More information about the Idna-update mailing list