Tagging transliterations from a specific script
cewcathar at hotmail.com
Thu Mar 17 22:41:55 CET 2011
Hi. I do like the subtags Avram has proposed here; I have no problem either with a sequence of two subtags ([iskeimla] or [yanaimla] followed by [alalc97]). Is Bashkir still going to be listed as a possible prefix?
If there are other cases of transliterations from a single language but multiple scripts,
then I would definitely support an extension.
Avram Lyon ajlyon at ucla.edu
Thu Mar 17 06:38:15 CET 2011
> . . .
> Can anyone comment on the solution I floated of introducing subtags
> for the real language variants that define Tatar in the Arabic-script
> period? It seems workable to me, but the critical eyes are always more
> alert to pitfalls of language metadata proposals than I am.
> Again, I would propose:
> tt-Arab-iskeimla <=> tt-iskeimla:
> Tatar in the traditional orthography based on Persian and Chagatay,
> first standardized by Qayumi Nasiri in the early 19th century and,
> subject to individual variation, in use until about 1920, and used
> within the Tatar diaspora.
> tt-Arab-yanaimla <=> tt-yanaimla:
> Tatar in the revised orthography introduced in 1920, with more
> explicit marking of vowels and vowel quality, in use until the
> introduction of Janalif.
O.k., but one quick question: before 1905, was not the language in use actually "Old Tatar"? (Someone who knows Tatar should have a quick answer on this. In any case there is no subtag for Old Tatar, so it would be tagged [tt], I presume.)
> So my romanizations from Arabic script Tatar would be:
> tt-iskeimla-alalc97 (equivalent to tt-Latn-iskeimla-alalc97)
> -- depending on the original source orthography.
To me personally it's fine to use these variants with the variant alalc97; the second subtag, [alalc97], indicates that the transliteration was done based on the data in the Library of Congress romanization tables and using the characters listed at www.loc.gov/catdir/cpso/roman.html and www.loc.gov/catdir/cpso/romanization/charsets.pdf, if I'm not mistaken.
(I personally would also have no objection to registering an extension to handle the source scripts of transliterations, but think that for this case alone an extension is not needed. However, there may be some other use cases [Mongolian?]. Also I think Avram mentioned Bashkir as possibly being a prefix here. Is Bashkir also going to be listed as a prefix?)
Avram Lyon ajlyon at ucla.edu
Mon Mar 14 21:23:56 CET 2011
> . . .
> I have a short bibliography of orthographic manuals on the early
> Janalif period that confirm that (1) iske imla and yana imla are
> distinct orthographies and (2) yana imla and janalif differ in more
> than script.
I have only found a little online on iske imla and yana imla.
According to: www.pentzlin.com/Missingjanalifcharacter.pdf
"In the early 1920s Azerbaijanis invented their own Latin alphabet, but Tatarstan scholars set a little store to this project, preferring to reform the İske imlâ (en.wikipedia.org/wiki/iske_imla). The simplified İske imlâ, known as Yaña imlâ (en.wikipedia.org/wiki/yana_imla) was used from 1920–1927."
-- Wikipedia seems to be the main source of information about these alphabets;
www.topix.com/forum/world/macedonia/TRUO6MJ0V7P8BHR1E/p8 says little about these alphabets, but does bring up Old Tatar, which was used in pre-1905 documents according to the info there.
--C. E. Whitehead
cewcathar at hotmail.com
> Then my use case could be tagged as such:
>> * Tatar, transliterated from original Arabic into Latin
> -- OR --
>> * Tatar, transliterated from original Cyrillic into Latin
>> tt-alalc97 [if we're assuming tt-Cyrl]
> -- OR --
> tt-?????-alalc97 [if a new variant tag is assigned for the present
> Tatar Cyrillic orthography, possibly to distinguish it from the
> Krashen Tatar Cyrillic orthography used until ~1939 among Christian
> Tatars, or simply to help us in this specific case.]
> So my thinking is that a careful look at the changes the languages
> themselves underwent will help us find a way out of this mess-- there
> are authentic language variants that bear more meaning than is
> conveyed by the script subtag.
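To make the equivalences proposed above concrete, here is a minimal sketch in Python of how a consumer might compare such tags. It is only an illustration under stated assumptions: [iskeimla] and [yanaimla] are the subtags proposed in this thread, the table of implied scripts is hypothetical, and the function names are made up for the example. It handles only simple language-script-variant tags and ignores extlang, region, extension and private-use subtags.

```python
import re

# Hypothetical table: the script each variant is assumed to imply.
# 'iskeimla' and 'yanaimla' are the subtags proposed in this thread;
# 'alalc97' marks an ALA-LC romanization, so the text itself is in Latin script.
IMPLIED_SCRIPT = {
    "alalc97": "Latn",
    "iskeimla": "Arab",
    "yanaimla": "Arab",
}


def split_tag(tag):
    """Split a simple language[-script][-variant...] tag into its parts."""
    subtags = tag.split("-")
    language = subtags[0].lower()
    rest = subtags[1:]
    script = None
    # A script subtag is exactly four letters and directly follows the language.
    if rest and re.fullmatch(r"[A-Za-z]{4}", rest[0]):
        script = rest[0].title()
        rest = rest[1:]
    variants = [v.lower() for v in rest]
    return language, script, variants


def effective_script(tag):
    """Return the explicit script, or the one the variants are assumed to imply."""
    _, script, variants = split_tag(tag)
    if script is not None:
        return script
    # A romanization variant describes the script of the text as written,
    # so it takes priority over the source-orthography variants.
    if "alalc97" in variants:
        return IMPLIED_SCRIPT["alalc97"]
    for variant in variants:
        if variant in IMPLIED_SCRIPT:
            return IMPLIED_SCRIPT[variant]
    return None


def same_tagging(a, b):
    """True if two tags agree on language, variants and effective script."""
    lang_a, _, variants_a = split_tag(a)
    lang_b, _, variants_b = split_tag(b)
    return (
        lang_a == lang_b
        and variants_a == variants_b
        and effective_script(a) == effective_script(b)
    )
```

Under these assumptions, same_tagging("tt-Arab-iskeimla", "tt-iskeimla") and same_tagging("tt-iskeimla-alalc97", "tt-Latn-iskeimla-alalc97") both hold, while tt-iskeimla and tt-yanaimla stay distinct because the variants differ.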