el-latn, ru-latn, and related possibilities

Thu Oct 6 03:23:57 CEST 2005

> From: Tex Texin [mailto:tex at yahoo-inc.com]

> Guys, sorry to be the odd man out yet again, but we should first run
> through all the use cases

Certainly.

> Text to voice is important for accessibility. Identification of the
> transliteration scheme would be a prominent requirement and perhaps
> therefore ru-Latn is not sufficient and should not be recommended as
> adequate.

Nobody said that ru-Latn would be adequate for all applications. There
may be *some* applications for which it's adequate. I could certainly
imagine someone looking for transliterated Russian content without
particularly caring what transliteration scheme was used (or even caring
to know it was considered "transliteration" rather than merely Russian
in Latin letters). For *that* scenario, ru-Latn would certainly be
adequate. But there's no doubt that in a text-to-voice application
something more specific would be required -- and supportable in 3066bis.

> Also, if we buy the argument that script was important enough to break
> compatibility with lang-region, and to instead associate script with
> language as lang-script-region, I would think we would want
> transliteration to also be tied with script and not go after region.
> 
> Something like zh-hans-pinyin-cn rather than zh-hans-cn-pinyin.

(I presume you meant zh-Latn-pinyin-cn / zh-Latn-cn-pinyin.)

The reason for putting script as the second subtag was that it would
typically be far more important for a user to get content in a
particular script than in a particular dialect or spelling variant. When
you get down to the level of selecting between one transliteration
scheme over another, I think the level of concern goes *way* down: 

- Text in the wrong transliteration scheme will likely still be legible
(and even minimally understandable in text-to-speech), while text in the
wrong script will likely be quite illegible.

- It has already been questioned how widespread the need for negotiation
with respect to transliteration will be. For the issue you raise to be of any
concern, we have to be looking at preferences for a particular
transliteration scheme *and also* a particular dialect (regional
spelling variant won't be a factor for transliterations -- that would
amount to a new transliteration scheme). Moreover, it would only be a
significant concern if it was clear that most users in this scenario
would be far more concerned to get the dialect right than to get the
transliteration scheme right. IMO, we're talking about highly
hypothetical scenarios here, and it's not possible to say that one is
clearly more important than the other. But let's suppose that there is
*some* user scenario where people really need to get the dialect right
rather than the transliteration scheme. Here we're surely talking about
a specialized scenario, and the Language Tag Matching spec that LTRU is
preparing will describe means that an application can use that will
achieve that end. But I really don't think it's a concern for the
widespread implementations of left-prefix matching algorithms, which
were the main reason for putting script before region.
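
To make the left-prefix point concrete, here is a minimal sketch of
RFC 3066-style prefix matching, applied to the two subtag orderings
discussed above. The function names and the tag list are hypothetical,
made up for illustration only; the tags are the examples from this
thread, not registered forms, and this is not the matching scheme that
LTRU will specify.

```python
def prefix_match(language_range: str, tag: str) -> bool:
    """Return True if the range equals the tag or is a prefix of it
    that ends on a subtag boundary. Comparison is case-insensitive."""
    r = language_range.lower()
    t = tag.lower()
    return t == r or t.startswith(r + "-")


# Hypothetical content tags, using the orderings from this thread.
TAGS = ["ru-Latn", "ru-Cyrl", "zh-Latn-pinyin-CN", "zh-Latn-CN-pinyin"]


def matching(language_range: str) -> list[str]:
    """List the tags in TAGS that the given range matches."""
    return [t for t in TAGS if prefix_match(language_range, t)]


print(matching("ru-Latn"))         # ['ru-Latn']
print(matching("zh-Latn"))         # ['zh-Latn-pinyin-CN', 'zh-Latn-CN-pinyin']
print(matching("zh-Latn-CN"))      # ['zh-Latn-CN-pinyin']
print(matching("zh-Latn-pinyin"))  # ['zh-Latn-pinyin-CN']
```

A request for zh-Latn picks up both orderings, so the script preference
is honored either way. The ordering only matters to a user who asks for
one more subtag, region or transliteration scheme, and does not care
about the other.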

Peter Constable