Tagging transliterations from a specific script

CE Whitehead cewcathar at hotmail.com
Mon Feb 14 21:34:34 CET 2011

Hi.  [sarab] is o.k. (but I don't think it would be listed as [sArab] in the registry, since the variant subtag is technically all lower-case).  I personally want to suggest [transarb] -- if that makes any sense.

Phillips, Addison addison at lab126.com 
Mon Feb 14 17:22:21 CET 2011 
>> I'd suggest a tag for the Turkic languages affected by the
>> introduction of Janalif, before the introduction of the same, but I
>> don't want to cause the same justifiable concern that was raised about
>> my proposed "pre1917" tag on this list last fall. Also, such a tag
>> would really just represent a script, so in most cases it would be
>> equivalent to, e.g., tt-Arab, az-Arab. It only really is needed, then,
>> when the actual script is not Arabic, so tt-Latn-ARABIC (not a real,
>> or legal, subtag). So tt-Arab and tt-ARABIC are completely identical.
> If I understand the problem correctly, you want to distinguish between "tt-alalc97" when
> transliterated from the Arabic script vs. the Cyrillic script. This suggests to me that you want a
> subordinate subtag (following alalc97) rather than trying to repurpose some unrelated but already
> defined subtag value. 
Thanks for restating this.   
This is what I understand too.  (I assume this is what Avram Lyon means.)
So ideally [alalc97] would be registered as the prefix.
However, I suppose we cannot have *-alalc97 registered as the prefix. 
And, if we do not register a prefix with this subtag, it seems we cannot do so later, according to RFC 5646:
"If a record includes no 'Prefix' field, a 'Prefix' field MUST NOT be
added to the record at a later date. Otherwise, changes (additions,
deletions, or modifications) to the set of 'Prefix' fields MAY be
registered, as long as they strictly widen the range of language tags . . ."
(If I recollect correctly, you all decided to continue not allowing wildcards, so the only options are to list all possible prefixes, or else to note in a comment that this variant should follow [alalc97].)
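
To make the no-wildcards point concrete, here is a rough sketch (Python) of checking a tag against an explicit list of prefixes. The subtag [transarb] and the prefix list are only my illustration; nothing here is registered. The comparison is a plain "leading subtags" test, which is a simplification of how RFC 5646 treats prefixes.

```python
def matches_registered_prefix(tag, variant, prefixes):
    """Return True if the subtags before `variant` begin with one of `prefixes`.

    Simplified sketch: a case-insensitive comparison of leading subtags.
    """
    subtags = tag.lower().split("-")
    variant = variant.lower()
    if variant not in subtags:
        return False
    leading = subtags[:subtags.index(variant)]
    for prefix in prefixes:
        wanted = prefix.lower().split("-")
        if leading[:len(wanted)] == wanted:
            return True
    return False


# Hypothetical prefix list for the hypothetical subtag "transarb".
PREFIXES = ["tt-alalc97", "az-alalc97"]

print(matches_registered_prefix("tt-alalc97-transarb", "transarb", PREFIXES))
print(matches_registered_prefix("kk-alalc97-transarb", "transarb", PREFIXES))
```

With only tt and az listed, the Kazakh tag does not match; since a wildcard such as *-alalc97 is not available, each language would need its own Prefix line.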
> For example, you might consider registering a few subtags such as the following:
>      Type:             variant
>       Subtag:           sArab      (this would actually be lowercase in the registry)
One small comment: I don't think you can use an upper-case A in the variant subtag, can you?
My preferences for the name are [tranarab], [fromarab], [transarb], or something similar; the one option I do not like is [arabic], which I find confusing.
(This is just my personal preference. As I said before, I am not the expert.)
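
For what it is worth, the candidate names can be checked against the variant syntax in RFC 5646 (5 to 8 alphanumerics, or 4 characters beginning with a digit). The sketch below (Python) lowercases each candidate first, on the assumption that the registry lists variants in lower case; the function name is just mine.

```python
import re

# RFC 5646 ABNF for a variant: 5*8alphanum / (DIGIT 3alphanum),
# written here for input that has already been lowercased.
VARIANT_RE = re.compile(r"(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3})")


def registry_form(candidate):
    """Return the lowercased candidate if it fits the variant syntax, else None."""
    lowered = candidate.lower()
    return lowered if VARIANT_RE.fullmatch(lowered) else None


for name in ["sArab", "tranarab", "fromarab", "transarb", "arabic", "arab"]:
    print(name, "->", registry_form(name))
```

All of the suggested names fit the syntax; a bare four-letter [arab] does not, since four letters is the shape of a script subtag.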
--C. E. Whitehead
cewcathar at hotmail.com 

>      Description:      transliteration from the Arabic script
>      Prefix:           tt-alalc97 (etc.....)
>      Comments:         transliterated document's source script was Arabic; a document tagged
>      with this subtag will be in the Latin script. Differences in transliteration
>      occur depending on the source script.
> Alternatively, it might be time to consider a transliteration extension to forestall increasingly
> baroque subtag collections. Extensions allow for any subtag between 2 and 8 characters and can
> define their own rules for legal usage. For example, if 't' were assigned to an extension for
> transliteration, it might then define subtags to allow a tag like:
>  "tt-alalc97-t-arab" // Tatar transliterated from the Arabic script
> Writing an extension turns out not to be very hard. The main problem would be deciding what to put
> in it (which might be an intractable problem).
> Addison
> Addison Phillips
> Globalization Architect (Lab126)
> Chair (W3C I18N, IETF IRI WGs)
> Internationalization is not a feature.
> It is an architecture.
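
As a rough illustration of the extension idea quoted above, the sketch below (Python) splits a tag at a singleton and returns the extension's subtags. It assumes, as in the example tag, a hypothetical 't' extension whose subtags name the source script; it does not describe the syntax of any registered extension, and the function name is invented for this sketch.

```python
def split_extension(tag, singleton="t"):
    """Split `tag` into (everything before the singleton, extension subtags).

    Returns (tag, []) when the singleton is absent. Extension subtags must be
    2 to 8 ASCII alphanumerics, per the general extension syntax in RFC 5646.
    """
    subtags = tag.lower().split("-")
    for i in range(1, len(subtags)):
        if len(subtags[i]) != 1:
            continue
        if subtags[i] == "x":
            break  # private use: everything after 'x' is opaque
        if subtags[i] == singleton:
            ext = []
            for s in subtags[i + 1:]:
                if len(s) == 1:
                    break  # the next singleton starts another extension
                ext.append(s)
            ok = all(2 <= len(s) <= 8 and s.isascii() and s.isalnum() for s in ext)
            if not ext or not ok:
                raise ValueError("malformed extension in " + tag)
            return "-".join(subtags[:i]), ext
    return tag.lower(), []


print(split_extension("tt-alalc97-t-arab"))
```

Under these assumptions, the tag separates into the base "tt-alalc97" and the single extension subtag "arab", so the source script no longer has to be squeezed into a variant name.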