el-latn, ru-latn, and related possibilities

Tex Texin tex at yahoo-inc.com
Wed Oct 5 21:24:35 CEST 2005

Guys, sorry to be the odd man out yet again, but we should first run through all the use cases before deciding that transliteration can be pushed down the stack. This argument sounds to me more like a rationalization for continuing with 3066bis than a real attempt to address the question.

Text-to-speech is important for accessibility, and identifying the transliteration scheme would be a prominent requirement there. ru-Latn alone may therefore not be sufficient and should not be recommended as adequate.

Also, if we buy the argument that script was important enough to break compatibility with lang-region, and to instead associate script with language as lang-script-region, I would think we would want transliteration to be tied to script as well, rather than placed after region.

Something like zh-hans-pinyin-cn rather than zh-hans-cn-pinyin.
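
To make the ordering question concrete, here is a minimal sketch. It assumes a consumer that falls back by dropping subtags from the right one at a time; that fallback rule and the helper name fallback_chain are illustrative assumptions, not anything defined in 3066bis. The tags are just the two orderings above.

```python
def fallback_chain(tag):
    """Return the tag followed by each shorter prefix (one subtag dropped per step).

    Hypothetical helper: assumes simple right-to-left truncation as the
    fallback rule, which is an illustration and not a normative algorithm.
    """
    subtags = tag.lower().split("-")
    return ["-".join(subtags[:n]) for n in range(len(subtags), 0, -1)]


script_first = fallback_chain("zh-hans-pinyin-cn")
region_first = fallback_chain("zh-hans-cn-pinyin")

print(script_first)  # ['zh-hans-pinyin-cn', 'zh-hans-pinyin', 'zh-hans', 'zh']
print(region_first)  # ['zh-hans-cn-pinyin', 'zh-hans-cn', 'zh-hans', 'zh']
```

Under that assumption, only the first ordering ever yields a tag that keeps the transliteration scheme while dropping the region; with the second ordering the scheme is the first thing to be lost.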

What am I missing?
(yeah, yeah, marbles I know...)

Tex Texin
Internationalization Architect,   Yahoo! Inc.

> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no 
> [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of 
> Peter Constable
> Sent: Sunday, October 02, 2005 10:35 PM
> To: IETF Languages Discussion
> Subject: RE: el-latn, ru-latn, and related possibilities
> > From: ietf-languages-bounces at alvestrand.no
> > [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of John.Cowan
> >
> > > The question that arises is as follows: just as de-de-1901 and all
> > > the similar subtagging exists for German, might there be advantages
> > > in registrations (or alternatively a mechanism which avoided the
> > > need for registrations) which listed widely used transliterations
> > > into Latin?
> >
> > We haven't yet settled whether (when we are fully in the RFC 3066bis
> > regime) we should handle transliterations as simple variants (like
> > historical orthographies, geographical and social dialects, and the
> > like) or via the new extension machinery of RFC 3066bis, which is more
> > work to set up but is more general-purpose.  In any case, someone
> > would have to maintain a registry of transliterations, transcriptions,
> > and orthographies: no one has taken on that job yet.
>
> IMO, these should be variants and not extensions. Treating them as
> extensions would require a separate specification and entail that they
> would not be supported in protocols and specifications that reference
> RFC 3066bis unless specifically revised to reference the extension RFC
> as well -- which would be an incredible pain. So, for instance, while
> there's a reasonable likelihood that XML would be updated to reference
> RFC 3066bis, so that those tags could be used for xml:lang, it's far
> less likely that an extension RFC would be referenced by XML, meaning
> that transliterations could not be distinguished in xml:lang.
>
> I'm completely convinced that transliterated text in some language
> such as Russian can simply be treated as Russian-language data that is
> written in an alternate written form. For some purposes, tagging it as
> "ru" may be sufficient; for others, tagging it as "ru-Latn" may be
> needed, and for some purposes tagging it as (say) "ru-Latn-iso9r95"
> (assuming ISO 9:1995) may be appropriate. This nicely allows for
> useful degrees of specificity, and is completely adequate for
> indicating the distinguishing characteristics of the language and
> written form of the data. There's absolutely no reason I can see for
> complicating this either syntactically with an extension and its
> corresponding singleton, or procedurally by requiring a completely
> separate registration process to be set up. The only possible benefit
> of an extension would be the possibility of creating some means of
> generating tags for transliteration standards from some defined group
> of sources, such as ISO, but I think the potential benefits are
> limited and the cost high.
>
> My vote, therefore, would be to simply treat specific transliteration
> schemes via the variant subtag as defined in RFC 3066bis.
>
> Peter Constable
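
The "degrees of specificity" described in the quoted message can be sketched as simple prefix matching. This is only an illustration: the function name matches and the case-insensitive prefix rule are assumptions, and "iso9r95" is the hypothetical variant subtag from the message above, not a registered one.

```python
def matches(language_range, tag):
    """True if the tag equals the range or extends it with further subtags.

    Hypothetical helper: a plain case-insensitive prefix test on whole
    subtags, used here only to illustrate the idea.
    """
    r = language_range.lower()
    t = tag.lower()
    return t == r or t.startswith(r + "-")


tagged = ["ru", "ru-Latn", "ru-Latn-iso9r95", "ru-Cyrl"]

for language_range in ["ru", "ru-Latn", "ru-Latn-iso9r95"]:
    print(language_range, [t for t in tagged if matches(language_range, t)])
# ru ['ru', 'ru-Latn', 'ru-Latn-iso9r95', 'ru-Cyrl']
# ru-Latn ['ru-Latn', 'ru-Latn-iso9r95']
# ru-Latn-iso9r95 ['ru-Latn-iso9r95']
```

A request for "ru" accepts any Russian data, while a text-to-speech consumer that needs a known transliteration scheme would have to ask for the full three-subtag form.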
