Tagging transliterations from a specific script

Avram Lyon ajlyon at ucla.edu
Thu Mar 17 06:38:15 CET 2011

2011/3/17 Peter Constable <petercon at microsoft.com>:
> Another way to look at this: if you declare "tt-Arab-alalc92", you are by implication also declaring "tt-Arab". As Martin indicates, the latter contradicts any declaration of using alalc92.

Indeed, I certainly assume that tt-alalc97 is necessarily
tt-Latn-alalc97 -- any other interpretation would turn on its head
everything I've assumed about language subtags in my own processing of

Can anyone comment on the solution I floated of introducing subtags
for the real language variants that define Tatar in the Arabic-script
period? It seems workable to me, but other critical eyes are always
more alert to the pitfalls of language metadata proposals than I am.

Again, I would propose:
tt-Arab-iskeimla  <=> tt-iskeimla:
Tatar in the traditional orthography based on Persian and Chagatay,
first standardized by Qayumi Nasiri in the early 19th century and,
subject to individual variation, in use until about 1920, and used
within the Tatar diaspora.
tt-Arab-yanaimla <=> tt-yanaimla:
Tatar in the revised orthography introduced in 1920, with more
explicit marking of vowels and vowel quality, in use until the
introduction of Janalif.

So my romanizations from Arabic-script Tatar would be:
tt-iskeimla-alalc97 (equivalent to tt-Latn-iskeimla-alalc97) or
tt-yanaimla-alalc97 (equivalent to tt-Latn-yanaimla-alalc97)
 -- depending on the original source orthography.
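
To make the equivalences above concrete, here is a minimal Python sketch of
how a processor might expand the implied script. It rests on assumptions
that are mine alone: "iskeimla" and "yanaimla" are only proposed here and
are not registered, the table of implied scripts is hypothetical, and the
function name expand_implied_script is invented for illustration. It handles
only tags of the form language[-script][-variant...], not regions or
extensions.

```python
# Hypothetical table: the script each variant subtag is taken to imply,
# following the reading in this message. "iskeimla" and "yanaimla" are
# proposals, not registered subtags.
IMPLIED_SCRIPT = {
    "alalc97": "Latn",   # a romanization, so the text is in Latin script
    "iskeimla": "Arab",  # traditional Arabic-script orthography
    "yanaimla": "Arab",  # revised Arabic-script orthography of 1920
}

# Assumption: a romanization variant overrides the script implied by the
# source orthography, since it describes the script actually written.
ROMANIZATIONS = {"alalc97"}


def expand_implied_script(tag):
    """Return the tag with an explicit script subtag if one is implied."""
    subtags = tag.split("-")
    language, rest = subtags[0].lower(), subtags[1:]

    # A script subtag is four letters and directly follows the language.
    if rest and len(rest[0]) == 4 and rest[0].isalpha():
        variants = [v.lower() for v in rest[1:]]
        return "-".join([language, rest[0].title()] + variants)

    variants = [v.lower() for v in rest]
    romanized = [v for v in variants if v in ROMANIZATIONS]
    known = romanized or [v for v in variants if v in IMPLIED_SCRIPT]
    if not known:
        # Nothing implies a script; leave the tag as it is.
        return "-".join([language] + variants)
    return "-".join([language, IMPLIED_SCRIPT[known[0]]] + variants)


for example in ("tt-iskeimla", "tt-yanaimla-alalc97", "tt-alalc97"):
    print(example, "<=>", expand_implied_script(example))
```

The one design choice worth noting is the precedence rule: when both an
orthography variant and alalc97 are present, the sketch lets alalc97 decide
the script, which is what makes tt-iskeimla-alalc97 come out as
tt-Latn-iskeimla-alalc97 rather than contradicting itself.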

If this approach makes sense, we can get away from the messy business
of how transliterations work, and back to the more palatable task of
debating the merits of these language variant subtags as such -- I
think that they have utility in their own right.
