<html>

<head>

<style><!--

.hmmessage P

{

margin:0px;

padding:0px

}

body.hmmessage

{

font-size: 10pt;

font-family:Tahoma

}

--></style>

</head>

<body class='hmmessage'>

Hi, Avram:<BR>Avram Lyon ajlyon at ucla.edu <BR>Sat Feb 12 23:24:44 CET 2011 <BR>

<BR>> Dear IETF-Languages,<BR>

> I have a set of data available in several forms: Tatar, written in the<BR>> Arabic script (tt-Arab); Tatar, written in the Cyrillic script<BR>> (tt-Cyrl); transliteration of that same text into Latin script. The<BR>> original text is in tt-Arab, so the transliteration (since it follows<BR>> ALA-LC 1997) should certainly be tagged tt-alalc97. That tag, however,<BR>> is precisely what we'd use for a transliteration using the ALA-LC<BR>> system from tt-Cyrl as well. Thus, there's no way to distinguish<BR>> between the two very different representations of the same text (i.e.,<BR>> the ALA-LC system is defined for Arabic scripts and for Cyrillic<BR>> scripts, but the systems lead to very different representations).<BR>

> The real-world case where this arises is in the multilingual version<BR>> of Zotero, the bibliographic data management software. There, we're<BR>> allowing the entry of alternate representations of key fields using<BR>> any valid language tag, which has been great so far. But now we can't<BR>> represent this distinction; it would be something like<BR>> *tt-Arab-alalc97, but subtags aren't supposed to override one another,<BR>> just refine each other.<BR>

If the text is a transliteration into Latin script, then how could you tag it tt-Arab . . . ?<BR>

(I'm sorry to ask a dumb question.)<BR>
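
To make the question concrete, here is a minimal sketch in Python (my own illustration; the parser is deliberately simplified, and the names split_tag, records and collisions are made up for the example, not part of any library or of Zotero). It assumes that a script subtag describes the script the content is actually written in. Under that assumption it shows both problems: tt-Arab-alalc97 reads as Arabic-script content, and the plain tt-alalc97 tag ends up attached to two different romanizations.<BR>

<PRE>
```python
import re

# Simplified subtag patterns. This sketch ignores extlang, region,
# extension, private-use and grandfathered subtags.
SCRIPT = re.compile(r"[A-Za-z]{4}")
VARIANT = re.compile(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}")


def split_tag(tag):
    """Split a tag such as 'tt-Arab-alalc97' into language, script, variants."""
    subtags = tag.split("-")
    language, rest = subtags[0].lower(), subtags[1:]
    script = None
    if rest and SCRIPT.fullmatch(rest[0]):
        script = rest.pop(0).title()
    variants = [s.lower() for s in rest if VARIANT.fullmatch(s)]
    return {"language": language, "script": script, "variants": variants}


# Hypothetical alternate representations of one field, keyed by language tag.
records = [
    ("tt-Arab", "original text in Arabic script"),
    ("tt-Cyrl", "same text in Cyrillic script"),
    ("tt-alalc97", "ALA-LC romanization made from the Arabic-script text"),
    ("tt-alalc97", "ALA-LC romanization made from the Cyrillic-script text"),
]

# Group the representations by tag and keep the tags that are ambiguous.
by_tag = {}
for tag, description in records:
    by_tag.setdefault(tag, []).append(description)
collisions = {tag: found for tag, found in by_tag.items() if len(found) > 1}

# The script subtag says the content itself is in Arabic script.
print(split_tag("tt-Arab-alalc97")["script"])
# Tags that cannot distinguish the representations attached to them.
print(collisions)
```
</PRE>

In this sketch tt-alalc97 is the only tag carrying more than one representation, which is the ambiguity described in the original message.<BR>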

> I think it might be appropriate to introduce a variant subtag for<BR>> Tatar in the Arabic script, which was used until the introduction of<BR>> Janalif in 1927-1928 (tt-Latn, tt-baku1926), but I'd be glad to hear<BR>> other options for distinguishing these data.<BR>

Do you mean a variant to indicate the romanization of Tatar that was originally written in the Arabic script?<BR>

> Regards,<BR>

> Avram<BR>

 <BR>

 <BR>

I'm not sure that I completely understand the request (my apologies). <BR>

Another option is to use metadata; the date of the text might well provide a clue to the original script (if that is what you are asking for: a way to distinguish the original script).<BR>

However, I personally have no objection to having two variants indicating two distinct ALA-LC romanizations,<BR>

but I hope we will hear from a few others on this matter (I am no expert in ALA-LC romanizations).<BR>

 <BR>

In any case,<BR>

[alalc97] is not just for Tatar, is it? (Let me know if it is.)<BR>

 <BR>

See:<BR>

<A href="http://www.iana.org/assignments/lang-subtags-templates/alalc97">http://www.iana.org/assignments/lang-subtags-templates/alalc97</A><BR>

 <BR>

 <BR>

So would romanizations from Arabic script for other languages also fit into your scheme?<BR>

 <BR>

Best,<BR>

 <BR>

--C. E. Whitehead<BR>

<A href="mailto:cewcathar@hotmail.com">cewcathar@hotmail.com</A><BR></body>

</html>