Language tags and (localization) processes (Re: [Ltru] draft-davis-t-langtag-ext)

Tue Jul 12 09:23:36 CEST 2011

The current draft states:

"Language tags, as defined by [BCP47], are useful for identifying the
language of content.  There are mechanisms for specifying variant
subtags for special purposes.  However, these variants are
insufficient for specifying text transformations, including content
that has been transliterated, transcribed, or translated."

I am requesting a clarification from the editors. It should involve a
liaison with the Unicode ULI TC (http://uli.unicode.org/), and the outcome
should be reflected in the draft.

Language tags so far have described *states*: an object is in a language, a
script etc. The proposed extension extends language tags to describe the
outcome of a *process*: an object has been transformed, with a source object
as the basis for this process. According to the paragraph above, this
transformation also includes translation.
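
To make the state/process distinction concrete, here is a minimal Python
sketch that splits a tag at the "t" singleton proposed by the draft. The
function name split_t_extension and the sample tag "ja-Kana-t-it" are my own
illustrations, not taken from the draft, and the parsing is deliberately
simplified (it does not validate the tag against the BCP 47 grammar).

```python
from typing import Optional, Tuple


def split_t_extension(tag: str) -> Tuple[str, Optional[str]]:
    """Split a language tag into its 'state' part and its 'process' part.

    Hypothetical helper: the state part is the tag without the "t"
    extension; the process part is whatever follows the "t" singleton,
    up to the next single-character subtag or the end of the tag.
    """
    subtags = tag.split("-")
    i = 1  # the first subtag is never an extension singleton
    while i < len(subtags):
        current = subtags[i].lower()
        if current == "x":
            break  # private use: nothing after this is an extension
        if current == "t":
            j = i + 1
            # Subtags inside an extension are longer than one character.
            while j < len(subtags) and len(subtags[j]) > 1:
                j += 1
            state = "-".join(subtags[:i] + subtags[j:])
            process = "-".join(subtags[i + 1:j])
            return state, process or None
        i += 1
    return tag, None


state, process = split_t_extension("ja-Kana-t-it")
print("state:", state)      # what the content is
print("process:", process)  # what the content was transformed from
```

With a tag like this, a single string answers two different questions: what
the object is, and what it was derived from.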

So far formats like TBX, XLIFF or others have been used for aligning source
and target contents. These formats also use language tags, via xml:lang.
However, the transformation, i.e. the process information, is not expressed
via the language tag, but via XML structures (pairs of source and target
elements). The language tags are purely for identifying the state of an
object.
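
As a rough illustration of that division of labour, the following Python
sketch reads a simplified, namespace-free fragment modelled on an XLIFF
trans-unit. The fragment and the describe_unit helper are assumptions made
for this example only; real XLIFF files carry a namespace and more
structure.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Simplified, hypothetical fragment modelled on an XLIFF trans-unit.
SAMPLE_UNIT = """
<trans-unit id="1">
  <source xml:lang="en">Hello</source>
  <target xml:lang="de">Hallo</target>
</trans-unit>
"""


def describe_unit(unit_xml: str) -> dict:
    """Return the language state of each side of a source/target pair."""
    unit = ET.fromstring(unit_xml)
    source = unit.find("source")
    target = unit.find("target")
    return {
        # State: each element only says which language it is in.
        "source_lang": source.get(XML_LANG),
        "target_lang": target.get(XML_LANG),
        # Process: implied by the pairing of the two elements,
        # not by anything inside either language tag.
        "source_text": source.text,
        "target_text": target.text,
    }


print(describe_unit(SAMPLE_UNIT))
```

Neither "en" nor "de" says anything about translation here; the fact that
the target was derived from the source is carried by the element pair alone.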

To avoid confusing users of these and other process-related formats about
where to put language identification information and where to put
process-related information, I am asking you to:

1) Liaise with the ULI TC about the issue described above and find out
   what concerns they have.
2) Document the outcome of this liaison on this list and in the draft.

There is no need for long explanations in the draft, but some guidance on
the topic would be very helpful.

As a side note, formats like TBX, XLIFF and others limit the role of the
language tag for good reasons: information related to processes like
translation can be very complex, e.g. translation state, cycle, or
quality. So my general concern is that language tags might be overloaded
with key-value pairs in areas that require more complex information, and
that they might overlap with formats that already provide that information.
Nevertheless, I won't object to moving this extension forward if these
concerns are explained properly in the draft.

Felix

2011/7/12 Mark Davis ☕ <mark at macchiato.com>

> We've posted a new version of
> http://tools.ietf.org/html/draft-davis-t-langtag-ext
>
> Diffs are here:
> http://tools.ietf.org/rfcdiff?url2=draft-davis-t-langtag-ext-02.txt
>
> The changes are:
>
> * Made it clear that application to the case of speech was included, added
> Peter C's example.
> * Fixed references, adding authors, removing unneeded reference.
> * Changed ABNF. Mostly just the table form, but also defined alphanum.
> * Made it clear that the CLDR committee must post proposals publicly.
> * Added more information on the XML structure, including the description
> attribute. (Note that the CLDR committee had decided to add the description
> attribute before this process began.)
> * Added fixes for typos noted by CEW.
>
> Please let us know of further feedback.
>
> Note to Doug: The CLDR committee had agreed to move the descriptions into
> the bcp47 files, such as
> http://unicode.org/repos/cldr/trunk/common/bcp47/calendar.xml. Yoshito has
> the action to do that, and was able to accelerate it. So please take a look
> if you have the time.
>
> Mark