Language Identifier List Comments, updated

Doug Ewell dewell at
Mon Dec 27 09:20:41 CET 2004

JFC (Jefsey) Morfin <jefsey at jefsey dot com> wrote:

> I gave some thinking to all this and reviewed the documents that W3C
> also prepare. I am afraid we want to put too many unrelated things
> into the same debate, due to a confusion between the three
> internationalization, multilingualization and vernacularization layers
> which are not identified and documented yet, while some attempt to
> discuss what belongs to lingual authoritative sources.

Unfortunately, Jefsey is talking past me again, but I think there may be
some confusion between the draft (RFC 3066bis) and Tex's page.

Tex is putting together an informative Web page, attempting to identify
which language tags can reasonably be considered "complete" by
themselves and which need to be qualified by a region subtag.  For
example, according to the page, "ca" for Catalan is enough information;
there is no reason to qualify it with a region subtag, as in "ca-ES",
because Catalan is Catalan regardless of where it is spoken, or because
it is spoken only in Spain.  On the other hand, "es" needs to be qualified as
"es-ES", "es-MX", "es-AR", or whatever, because those variants of
Spanish differ.
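
To make the distinction concrete, here is a minimal sketch in Python of
the kind of lookup such a page implies.  The table and the function
name recommend_tag are hypothetical; only the "ca" and "es" entries
come from the discussion above, and nothing here is taken from Tex's
page or from the draft.

```python
# Hypothetical table: True means the bare language tag is considered
# incomplete and should be qualified by a region subtag.
# Only these two entries are grounded in the discussion above.
NEEDS_REGION = {"ca": False, "es": True}


def recommend_tag(language, region=None):
    """Return the tag this sketch would recommend for a language/region pair."""
    language = language.lower()
    needs_region = NEEDS_REGION.get(language)
    if needs_region is None:
        raise KeyError(f"no recommendation recorded for {language!r}")
    if not needs_region:
        # The language subtag is enough; any region is dropped.
        return language
    if region is None:
        raise ValueError(f"{language!r} should be qualified by a region subtag")
    return f"{language}-{region.upper()}"
```

With this table, recommend_tag("ca", "ES") gives back just "ca", while
recommend_tag("es", "MX") gives "es-MX" and recommend_tag("es") is
rejected as underspecified.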

That's it.  Tex is not trying to create an IETF or W3C document, or
express anything normative.  So we need to be careful to specify which
document we are commenting on.

> As for naming, languages are chosen and documented by the local
> internet communities, represented by their Trustees, the ccTLD
> Managers (the SLD Manager for privately defined tags). The same as
> IANA is not in the business of defining countries (RFC 1591), IANA is
> not in the business of defining the languages of the countries.

NOBODY is trying to make IANA do this.  This has been said before, and
apparently needs to be said again.

RFC 3066bis defines *codes* for languages and for "regions," which are
not even necessarily countries (most of the U.N.-based numeric codes are
for geographic regions such as "Europe" or "Western Africa").  The codes
come from ISO and U.N. sources, and none are invented by any agent of
IANA or designee of RFC 3066bis except to resolve conflicts created by
those sources.  The RFC never defines the *entities* associated with
these codes, and it is very clear about where the definitions do come
from (generally the U.N.).
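
As an illustration of the two kinds of region code, the sketch below
assumes that a region subtag is either a two-letter country code or a
three-digit U.N. numeric code, which is my reading of the draft.  The
function name classify_region is hypothetical, and the check is purely
syntactic: it says nothing about whether a code is actually assigned.

```python
import re

# Assumed shapes: two ASCII letters (ISO 3166 style) or three ASCII
# digits (U.N. numeric style).  Neither pattern checks assignment.
_ALPHA2 = re.compile(r"[A-Za-z]{2}")
_NUMERIC3 = re.compile(r"[0-9]{3}")


def classify_region(subtag):
    """Classify a region subtag by shape only; return None if it fits neither."""
    if _ALPHA2.fullmatch(subtag):
        return "alpha-2"
    if _NUMERIC3.fullmatch(subtag):
        return "numeric"
    return None
```

Here "ES" is classified as "alpha-2" and "150" (the U.N. numeric code
for Europe) as "numeric", while a name such as "Europe" fits neither
shape, which matches the point that the codes are borrowed and the
entities behind them are defined elsewhere.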

RFC 3066bis also does not make any attempt to define which languages are
(or should be) used in any given country or region.

Tex's page does attempt to determine (not dictate) language usage, for
purposes of recommending tag usage or documenting preferred usage, but
again, his page has nothing to do with IANA.

> All what an _RFC_ can say is that language tags identify the IDNA
> Tables published by the ccTLD Manager, as the Trustee of his local
> internet community (we talk of the language used by network/protocol
> related issues). Or by the SLD Managers for their domain. I certainly
> favor Unicode, locales, contexts, etc. converge, but that rises first
> many many more multilingual Internet related issues, the RFC 3066bis
> does not want to discuss.

As Tex said, language tags are used for much more than IDNA.  And once
again, I fail to see what Unicode has to do with any of this.

> I fully understand that most of the ccTLD Managers have not published
> language tables and that other applications than DNS call for an
> immediate support, also that SLD Manager may need off-the-shelves
> tables. However this support by non-ccTLD Managers can only be
> temporary and MUST be eventually consistent with the ccTLD Manager
> tables such an RFC should call for. Otherwise we have a real layer and
> authority violation, all the more than this is not only by RFC 1591,
> ICANN ICP-1 but also by the WSIS 2003 Resolutions underlining the
> sovereignty of Govs over ccTLDs. There is no problem in documenting
> the duties of a ccTLD Manager in this area and in discussing it with
> ccTLDs Managers, as an addition to the ccTLD Manager BPs.

This is way out of scope for RFC 3066bis or any of its predecessors.

> I would therefore review the ABNF in four areas:
> - favoring the three letter codes for the language to make this entry
> time independent and consistent (this does not change anything in the
> current applications)

No.  RFC 3066bis is not going to break interoperability with RFC 3066 by
switching to alpha-3 language codes for all languages, forcing users to
replace "en" with "eng", "fr" with "fra", and so forth.  This is simply
not going to happen.
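
A small sketch of why this breaks things, assuming an existing
application that compares tags by simple case-insensitive string
equality.  The two-entry mapping uses only the pairs named above; the
function names to_alpha3 and exact_match are hypothetical.

```python
# Only the two pairs mentioned above; a real mapping would be far larger.
ALPHA2_TO_ALPHA3 = {"en": "eng", "fr": "fra"}


def to_alpha3(tag):
    """Rewrite the primary language subtag of a tag to its alpha-3 form."""
    primary, sep, rest = tag.partition("-")
    # Unknown primary subtags are passed through unchanged.
    return ALPHA2_TO_ALPHA3.get(primary.lower(), primary) + sep + rest


def exact_match(a, b):
    """Case-insensitive equality, as a simple existing matcher might do."""
    return a.lower() == b.lower()
```

Under these assumptions to_alpha3("en-US") yields "eng-US", and
exact_match("en-US", "eng-US") is false, so content tagged the old way
would no longer match requests tagged the new way.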

> - introduce the quoted language icon URL
> - the URL of the corresponding table
> - a possible comment on the orientation of the table.
> I would add a paragraph indicating that languages tags actually
> designate the language tables decided by the local internet
> communities through their ccTLD Manager. Underlining that there is as
> many language tags in use as supported, requested or prepared tables.

I have no idea what any of this refers to.

-Doug Ewell
 Fullerton, California
