Here comes the Yiddish
Thu, 5 Dec 2002 11:38:37 -0600

On 12/02/2002 01:06:35 AM John Cowan wrote:

>Doug Ewell scripsit:
>> By defining 4-letter second subtags to be script codes, in some future
>> revision to RFC 3066, it would become unnecessary to register special
>> tags like yi-hebr and yi-latn.  This situation will come up again and
>> again (e.g. az-Cyrl and az-Latn).
>Oh, come on.  What is this vast supply of languages written in more than
>one script?  The situation is rare.  Sanskrit might be the record-holder
>for max number of scripts, but I doubt if there are more than 25-35
>languages that are written in more than one script.
>Not a huge problem.

I suspect the problem is rather bigger than that. There are lots of
situations in which languages span national borders or are shared by
distinct ethnic groups resulting in a need for multiple scripts. This is
actually quite common. In Ethiopia, there are several languages written
using both Ethiopic and Roman scripts. Across the southern Sahara, there are
a number of languages being written in both Arabic and Roman (and throw in
Tifinagh for some). We're all familiar with the transitions between
Cyrillic, Arabic and Latin in Central Asia. In SE Asia, there are many
languages that cross national borders. I've got users telling me they may
need to publish documents in four or even five writing systems (all for
the same language).

Moreover, we should not just be thinking in terms of scripts here. The real
issue is identifying writing systems. As Jon Hanna pointed out, most
languages are written at some point in more than one form. Take English,
for example: an English text may use the standard English writing system
that is common to all English varieties; but that English text might also
be in phonetic transcription, or in Braille, or in some kind of shorthand.
These are all the same language, just different writing systems. We need to
be able to tag data to identify not only the language but also the writing
system. (How long have we needed a better way to distinguish Simplified vs.
Traditional Chinese text than zh-CN vs. zh-TW?)
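
To make Doug's suggestion concrete, here is a rough sketch of how a tag
consumer might treat a 4-letter second subtag as a script code. Nothing
here comes from RFC 3066 itself: the function name split_tag, the
returned tuple, and the title-casing of the script subtag are my own
assumptions for illustration only.

```python
def split_tag(tag):
    """Split a language tag into (language, script, remaining subtags).

    Assumption (not in RFC 3066): a second subtag made of exactly four
    letters is read as a script code, e.g. "hebr" in "yi-hebr".
    """
    subtags = tag.split("-")
    language = subtags[0].lower()
    script = None
    rest = subtags[1:]
    # Hypothetical rule: 4 alphabetic characters in second position = script.
    if rest and len(rest[0]) == 4 and rest[0].isalpha():
        script = rest[0].title()
        rest = rest[1:]
    return language, script, rest


for tag in ("yi-hebr", "yi-latn", "az-Cyrl", "az-Latn", "zh-TW"):
    print(tag, split_tag(tag))
```

Note that this only covers the script case. A tag like zh-TW still comes
back with no script at all, which is exactly the Simplified vs.
Traditional gap mentioned above, and nothing in this sketch addresses
Braille, shorthand or phonetic transcription.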

- Peter

Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485