Swiss german, spoken

Peter Constable petercon at microsoft.com
Wed Jun 15 06:56:51 CEST 2005


> From: JFC (Jefsey) Morfin [mailto:jefsey at jefsey.com]


> >Incorrect. Issues discussed on this list relate to registration of
> >tags "for the identification of languages" -- that is, tags to be
> >used as metadata elements to declare the linguistic properties of
> >content in Internet and other protocols and applications. There is
> >nothing stated anywhere that these tags necessarily apply only to
> >text content.
> 
> Except that to register a language you must provide printed
> references ....

The relevant field on the form is:

"Reference to published description of the language (book or article)"

Note that the reference is to a *description* of the language, not
necessarily a work in the language itself. Thus, one could provide
references to descriptions of (say) American Sign Language, which is not
commonly written.

Several years ago on the IETF-languages list, there was some discussion
of what kinds of materials could be referred to. I had just the opposite
concern: someone might need a tag for a lesser-known language and not be
able to provide references to a description of the language. There was
consensus that references could be to a *description* of the language,
or to a work *in* the language. 


> I note your "declare the linguistic properties of content" which is
> something I could agree with. But which is not exactly the wording of
> the document you refer to.

No, it's not the wording of the RFC, but I very much feel the
appropriate characterization of "language tags" is that they primarily
function to declare attributes. (This distinguishes them, in my mind,
from locale identifiers, which primarily function as API parameters used
to tailor culture-dependent processes.) I'd include linguistic
attributes, in the primary sense of that term, but also include
attributes related to the written form -- script, orthography, spelling,
transcription, transliteration -- in the case of textual content. But
not all content need be textual; the system should facilitate tagging of
linguistic content regardless of the mode of expression.


 
> Due to the impact of ISO documents in the langtag registration process
> and of their parallel evolution agreed by everyone (even if the nature
> of the evolution may be different depending on the person) it is
> advisable to read ISO 639-1, -2, the drafts of -3, -4, -5, -6 you might
> find, ISO 15924 and ISO 3166. For those wanting to understand the
> possible future conflicts concerning the registrations discussed here
> they should consult ISO 11179 (scalability, updates, nature of the
> documented information, etc.).

It certainly isn't a bad idea to be familiar with the 639, 15924 and
3166 standards. For 639, there's no particular point in looking for
parts 4, 5 or 6 at this time since there isn't a complete working draft
of any of them, and there is no immediate plan to have any of them
impinge on RFC 3066 or some successor thereof. 

ISO 11179 takes rather a deeper level of interest and commitment. It's a
six-part compendium on metadata elements and registries and metamodels
for metadata elements. The IANA registry for language tags which is the
focus of this list has never been considered an implementation of this
ISO standard, and knowledge of this ISO standard is not a prerequisite
to making useful contributions to the work of this list. Familiarity
with ISO 11179 certainly wouldn't get in the way of contributing to this
list -- unless one begins to behave as though others on this list are or
ought to be familiar with it as well.



Peter Constable

