Tagging transliterations from a specific script
cewcathar at hotmail.com
Mon Feb 14 00:22:57 CET 2011
Avram Lyon ajlyon at ucla.edu
Sat Feb 12 23:24:44 CET 2011
> Dear IETF-Languages,
> I have a set of data available in several forms: Tatar, written in the
> Arabic script (tt-Arab); Tatar, written in the Cyrillic script
> (tt-Cyrl); transliteration of that same text into Latin script. The
> original text is in tt-Arab, so the transliteration (since it follows
> ALA-LC 1997) should certainly be tagged tt-alalc97. That tag, however,
> is precisely what we'd use for a transliteration using the ALA-LC
> system from tt-Cyrl as well. Thus, there's no way to distinguish
> between the two very different representations of the same text (i.e.,
> the ALA-LC system is defined for Arabic scripts and for Cyrillic
> scripts, but the systems lead to very different representations).
> The real-world case where this arises is in the multilingual version
> of Zotero, the bibliographic data management software. There, we're
> allowing the entry of alternate representations of key fields using
> any valid language tag, which has been great so far. But now we can't
> represent this distinction; it would be something like
> *tt-Arab-alalc97, but subtags aren't supposed to override one another,
> just refine each other.
If the text is a transliteration into Latin script, how could it be tagged tt-Arab?
(I'm sorry to ask a dumb question.)
> I think it might be appropriate to introduce a variant subtag for
> Tatar in the Arabic script, which was used until the introduction of
Do you mean a variant to indicate the romanization of Tatar that was originally written in the Arabic script?
> Janalif in 1927-1928 (tt-Latn, tt-baku1926), but I'd be glad to hear
> other options for distinguishing these data.
I'm not sure that I completely understand the request (my apologies).
Another option is to use metadata; the date of the text might also provide a clue as to the original script (if that is what you are asking for: a way to distinguish the original script).
However, I personally have no objection to having two variants indicating two distinct ALA-LC romanizations,
but I hope we will hear from a few others regarding this matter (I am not an expert in ALA-LC romanizations).
In any case, [alalc97] is not just for Tatar, is it? (Let me know if it is.)
So would romanizations from the Arabic script for other languages also fit into your scheme?
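
To make the metadata option above a little more concrete, here is a rough sketch in Python of how an application might keep the original script next to the language tag instead of inside it. Everything in it is an assumption for illustration: the TaggedText class, the source_script field and the same_representation helper are hypothetical names, not part of Zotero or of any language-tag library, and the text values are placeholders rather than real Tatar data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaggedText:
    """A field value with its language tag and optional out-of-band metadata."""
    text: str
    tag: str                             # BCP 47 language tag, e.g. "tt-alalc97"
    source_script: Optional[str] = None  # hypothetical field: ISO 15924 code of the original


def same_representation(a: TaggedText, b: TaggedText) -> bool:
    """Treat two values as the same representation only if both the
    (case-insensitive) tag and the recorded source script match."""
    return a.tag.lower() == b.tag.lower() and a.source_script == b.source_script


# Placeholder strings, not real Tatar data.
from_arabic = TaggedText("<romanized from tt-Arab>", "tt-alalc97", source_script="Arab")
from_cyrillic = TaggedText("<romanized from tt-Cyrl>", "tt-alalc97", source_script="Cyrl")

# The tags alone are identical, so they cannot separate the two romanizations.
tags_match = from_arabic.tag == from_cyrillic.tag

# With the extra metadata field, the two values can be told apart.
distinguishable = not same_representation(from_arabic, from_cyrillic)
```

This keeps the tag itself valid and unchanged, at the cost that the distinction is lost wherever only the tag is exchanged.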
--C. E. Whitehead
cewcathar at hotmail.com