Tagging transliterations from a specific script

Phillips, Addison addison at lab126.com
Mon Feb 14 17:22:21 CET 2011


> I'd suggest a tag for the Turkic languages affected by the
> introduction of Janalif, before the introduction of the same, but I
> don't want to cause the same justifiable concern that was raised
> about my proposed "pre1917" tag on this list last fall. Also, such a
> tag would really just represent a script, so in most cases it would
> be equivalent to, e.g., tt-Arab, az-Arab. It only really is needed,
> then, when the actual script is not Arabic, so tt-Latn-ARABIC (not a
> real, or legal, subtag). So tt-Arab and tt-ARABIC are completely
> identical.

If I understand the problem correctly, you want to distinguish between "tt-alalc97" when transliterated from the Arabic script vs. the Cyrillic script. This suggests to me that you want a subordinate subtag (following alalc97) rather than trying to repurpose some unrelated but already defined subtag value. 

For example, you might consider registering a few subtags such as the following:

      Type:             variant
      Subtag:           sArab      (this would actually be lowercase in the registry)
      Description:      transliteration from the Arabic script
      Prefix:           tt-alalc97 (etc.....)
      Comments:         transliterated document's source script was Arabic; a document tagged
          with this subtag will be in the Latin script. Differences in transliteration
          occur depending on the source script.
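
As a quick sanity check on the shape of such a subtag, here is a minimal Python sketch of the variant syntax rule (5 to 8 alphanumerics, or 4 characters starting with a digit). The function name is hypothetical, and it only tests well-formedness; it says nothing about whether a subtag is actually registered.

```python
import re

# Variant syntax: 5-8 alphanumerics, or a digit followed by 3 alphanumerics.
VARIANT = re.compile(r"(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3})")

def is_wellformed_variant(subtag):
    """Return True if subtag has the syntactic shape of a variant.

    Hypothetical helper for illustration; it does not consult the registry.
    """
    return VARIANT.fullmatch(subtag.lower()) is not None

print(is_wellformed_variant("sArab"))   # True
print(is_wellformed_variant("arab"))    # False
```

Note that a bare "arab" would not fit the variant syntax, since four letters is the shape of a script subtag.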

Alternatively, it might be time to consider a transliteration extension to forestall increasingly baroque subtag collections. Extensions allow for any subtag between 2 and 8 characters and can define their own rules for legal usage. For example, if 't' were assigned to an extension for transliteration, it might then define subtags to allow a tag like:

  "tt-alalc97-t-arab" // Tatar transliterated from the Arabic script
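
To illustrate how a tag like that would break apart, here is a rough Python sketch. It assumes the hypothetical 't' singleton described above and uses a made-up helper name (split_extensions). It only checks the 2 to 8 character rule for extension subtags and does not handle private-use ('x') sequences or grandfathered tags.

```python
import re

# Extension subtags are 2 to 8 alphanumeric characters.
EXT_SUBTAG = re.compile(r"[a-z0-9]{2,8}")

def split_extensions(tag):
    """Split a tag into (leading subtags, {singleton: [extension subtags]}).

    Hypothetical helper for illustration only; it does not handle
    private-use ('x') sequences or grandfathered tags.
    """
    leading, extensions, current = [], {}, None
    for sub in tag.lower().split("-"):
        if len(sub) == 1:
            # A one-character subtag (singleton) starts a new extension.
            if sub in extensions:
                raise ValueError("duplicate singleton: " + sub)
            current = sub
            extensions[current] = []
        elif current is None:
            # Everything before the first singleton: language, variants, etc.
            leading.append(sub)
        elif EXT_SUBTAG.fullmatch(sub):
            extensions[current].append(sub)
        else:
            raise ValueError("bad extension subtag: " + sub)
    for singleton, subs in extensions.items():
        if not subs:
            raise ValueError("empty extension: " + singleton)
    return leading, extensions

print(split_extensions("tt-alalc97-t-arab"))
# (['tt', 'alalc97'], {'t': ['arab']})
```

The point of the sketch is that everything after the singleton belongs to the extension, so the extension's own rules (not the registry's variant rules) would decide what "arab" means there.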

Writing an extension turns out not to be very hard. The main problem would be deciding what to put in it (which might be an intractable problem).

Addison

Addison Phillips
Globalization Architect (Lab126)
Chair (W3C I18N, IETF IRI WGs)

Internationalization is not a feature.
It is an architecture.




