Hi. I do like the subtags Avram has proposed here; I have no problem either with a sequence of two subtags ([iskeimla] or [yanaimla] followed by [alalc97]); is Bashkir still going to be listed as a possible prefix?<BR>
If there are other cases of transliterations from a single language, but multiple scripts,<BR>
then I would definitely support an extension.<BR>Avram Lyon <A title="Tagging transliterations from a specific script" href="mailto:firstname.lastname@example.org?Subject=Re: Tagging transliterations from a specific script&In-Reply-To=<AANLkTim7ZwEon5ku07DZzGoeYPJbgobLzSbR%2B03NBLgB@mail.gmail.com>"><FONT color=#0068cf>ajlyon at ucla.edu </FONT></A><BR>Thu Mar 17 06:38:15 CET 2011 <BR><PRE>> . . .
> Can anyone comment on the solution I floated of introducing subtags
> for the real language variants that define Tatar in the Arabic-script</PRE><PRE>> period? It seems workable to me, but the critical eyes are always more</PRE><PRE>> alert to pitfalls of language metadata proposals than I am.
> Again, I would propose:
> tt-Arab-iskeimla <=> tt-iskeimla:
> Tatar in the traditional orthography based on Persian and Chagatay,</PRE><PRE>> first standardized by Qayumi Nasiri in the early 19th century and,</PRE><PRE>> subject to individual variation, in use until about 1920, and used</PRE><PRE>> within the Tatar diaspora.
> tt-Arab-yanaimla <=> tt-yanaimla:
> Tatar in the revised orthography introduced in 1920, with more
> explicit marking of vowels and vowel quality, in use until the
> introduction of Janalif.</PRE><PRE>O.k., but one quick question: before 1905, was not the language in use actually "Old Tatar"? (Someone who knows Tatar should have a quick answer on this. In any case there is no subtag for Old Tatar so it would be tagged [tt] I presume.)
> So my romanizations from Arabic script Tatar would be:</PRE><PRE>> tt-iskeimla-alalc97 (equivalent to tt-Latn-iskeimla-alalc97)
> -- depending on the original source orthography.</PRE><PRE>To me personally it's fine to use these variants with the variant alalc97; the second [alalc97] subtag indicates that the transliteration was done based on the data in the Library of Congress Romanization tables and using the characters listed at <A href="http://www.loc.gov/catdir/cpso/roman.html">www.loc.gov/catdir/cpso/roman.html</A> <A href="http://www.loc.gov/catdir/cpso/romanization/charsets.pdf">www.loc.gov/catdir/cpso/romanization/charsets.pdf</A> if I'm not mistaken.</PRE><PRE>(I personally would also have no objection to registering an extension to handle the source scripts of transliterations,</PRE><PRE>but I think that for this case alone an extension is not needed. However, there may be some other use cases [Mongolian?]. Also I think Avram mentioned Bashkir as possibly being a prefix here. Is Bashkir also going to be listed as a prefix?)</PRE>
Avram Lyon ajlyon at ucla.edu <BR>Mon Mar 14 21:23:56 CET 2011 <BR>> . . .<BR><BR>> I have a short bibliography of orthographic manuals on the early<BR>> Janalif period that confirm that (1) iske imla and yana imla are<BR>> distinct orthographies and (2) yana imla and janalif differ in more<BR>> than script.<BR>
I have only found a little online on iske imla and yana imla.<BR>According to: <A href="http://www.pentzlin.com/Missingjanalifcharacter.pdf" target=_blank><FONT color=#0066cc>www.pentzlin.com/Missingjanalifcharacter.pdf</FONT></A><BR>"In the early 1920s Azerbaijanis invented their own Latin alphabet, but Tatarstan scholars set a little store to this project, preferring to reform the İske imlâ (en.wikipedia.org/wiki/iske_imla). The simplified İske imlâ, known as Yańa imlâ (en.wikipedia.org/wiki/yana_imla) was used from 1920–1927." -- Wikipedia seems to be the main source of information about these alphabets; <BR>
<A href="http://www.topix.com/forum/world/macedonia/TRUO6MJ0V7P8BHR1E/p8" target=_blank><FONT color=#0066cc>www.topix.com/forum/world/macedonia/TRUO6MJ0V7P8BHR1E/p8</FONT></A> says little about these alphabets, but does bring up Old Tatar which was used in pre-1905 documents according to the info here. <BR>Best,<BR> <BR>--C. E. Whitehead<BR><A href="mailto:email@example.com"><FONT color=#0068cf>firstname.lastname@example.org</FONT></A> <BR>
> Then my use case could be tagged as such:<BR>>> * Tatar, transliterated from original Arabic into Latin<BR>> tt-iskeimla-alalc97<BR>> -- OR --<BR>> tt-yanaimla-alalc97<BR>>> * Tatar, transliterated from original Cyrillic into Latin<BR>>> tt-alalc97 [if we're assuming tt-Cyrl]<BR>> -- OR --<BR>> tt-?????-alalc97 [if a new variant tag is assigned for the present<BR>> Tatar Cyrillic orthography, possibly to distinguish it from the<BR>> Krashen Tatar Cyrillic orthography used until ~1939 among Christian<BR>> Tatars, or simply to help us in this specific case.]<BR>
> So my thinking is that a careful look at the changes the languages<BR>> themselves underwent will help us find a way out of this mess-- there<BR>> are authentic language variants that bear more meaning than is<BR>> conveyed by the script subtag.<BR>> Avram<BR> </body>