New Last Call: 'Tags for Identifying Languages' to BCP

Peter Constable petercon at microsoft.com
Fri Dec 10 22:37:11 CET 2004


Bruce Lilly's message makes several inaccurate statements about the
proposed draft, and misrepresents some of the changes being made. My
main concern is that I don't know where else it was circulated. I might
be wrong, but I get the impression it was written with a different
audience in mind and then copied here.



> -----Original Message-----

> > There are problems with the RFC 3066 definition of generative tags,
> > however. The ISO 639 and ISO 3166 standards are not freely available
> > and evolve over time.
> 
> Accessibility has not been a problem for this implementor...

I agree with Bruce that accessibility of ISO 639 and ISO 3166 has not
been the issue. Unfortunately, his comments do not indicate what the
real issues were.


> > The largest change in the specification is that it modifies the
> > structure of the language tag registry. Instead of having to obtain
> > lists of codes from five separate external standards...

> Contrary to the implicit claim, the ISO documents mentioned
> above comprise two standards (available in two languages each),
> not "five separate external standards".

RFC 3066 made reference to ISO 639-1, ISO 639-2 and ISO 3166-1; the
proposed replacement adds ISO 15924. I would count that as four ISO
standards. Up-to-date code tables for all four are readily available.

 
> The availability of those two definitive standards in bilingual
> forms allows implementors to (for example) construct menus of
> available language and country code tags in BOTH languages used
> in ISO standards.  The draft proposes declaring those standards
> effectively irrelevant, being replaced by a single monolingual
> (English) IANA registry. While it has become fashionable in
> recent years among some factions within the United States
> to bash France, the French people, their culture, and their
> language, it seems inappropriate to extend such bashing to
> technical standards which supposedly apply in an international
> context. Especially when dealing with the subject matter of
> language itself. The unavailability of the registered value
> "description" in 50% of the languages traditionally used for
> international standards publication, including the existing ISO
> 639 and 3166 codes, is a serious defect in the proposal, and
> a departure from the status quo under RFC 3066...

I think this is a serious misrepresentation of the intent of the
proposal: the draft nowhere suggests, let alone declares, that the
source ISO standards are irrelevant. Rather, the intent of the
comprehensive registry is to ensure stability in IETF implementations by
protecting them from unpredictable changes in ISO standards, such as the
re-definition of "CS" as a country identifier not long ago. The
denotation of identifiers listed in the registry is based on their
definition in the ISO standards, not on an informative descriptor
provided in the registry; and as Bruce quite clearly pointed out, those
source standards are readily accessible. So the suggestion that
implementers will no longer have access to French-language names from
the source ISO standards is simply vacuous.

As for concerns of Anglo-centricity, I'm sure that the authors had no
anti-French motive, and would be open to suggestions as to how that
could be addressed. Surely, though, this is not a technical argument
against the proposal.


> The ABNF in the draft permits all of the following tags which
> are not legal per the RFC 3066 ABNF:
>    supercalifragilisticexpialidoceus
>    y-----
>    x1234567890abc
>    a123-xyz

In fact, none of these is permitted by the ABNF of the draft.


> Specifically, the draft allows, and RFC 3066 disallows:
>    subtags more than 8 octets in length

This is incorrect. It was true of an earlier draft, but that was
changed.

>    hyphens which do not separate subtags
>    zero-length subtags

These near-equivalent statements are incorrect. A hyphen is permitted
only when it is followed by a non-initial sub-tag, and no sub-tag can
be an empty string.

>    primary tags which are not purely alphabetic

This is incorrect. A primary sub-tag must be 2*3ALPHA or 4*8ALPHA, or
"i" or "x".

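To make the points above concrete, here is a rough sketch of a
well-formedness check built only from the constraints stated in this
message: a primary sub-tag of 2*3ALPHA or 4*8ALPHA, or "i" or "x"; no
sub-tag longer than 8 octets; no empty sub-tags. The names
is_well_formed, PRIMARY and SUBTAG are made up for illustration. The
assumption that later sub-tags are 1 to 8 letters or digits is carried
over from RFC 3066 and is not taken from the draft, and the sketch does
not model any further rules the draft's ABNF may impose.

```python
import re

# Primary sub-tag as described in this message:
# 2*3ALPHA or 4*8ALPHA, or the single letters "i" or "x".
PRIMARY = re.compile(r"[A-Za-z]{2,3}|[A-Za-z]{4,8}|[ixIX]")

# Assumption (from RFC 3066, not from the draft): every later
# sub-tag is 1 to 8 letters or digits.
SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")


def is_well_formed(tag: str) -> bool:
    """Check a tag against the simplified rules above (illustrative only)."""
    # Splitting on "-" turns a stray or doubled hyphen into an empty
    # string, which SUBTAG rejects because it needs at least one character.
    primary, *rest = tag.split("-")
    if not PRIMARY.fullmatch(primary):
        return False
    return all(SUBTAG.fullmatch(sub) for sub in rest)


if __name__ == "__main__":
    for tag in ("supercalifragilisticexpialidoceus", "y-----",
                "x1234567890abc", "a123-xyz", "en-US"):
        print(tag, is_well_formed(tag))
```

Under these simplified rules, all four of the tags Bruce lists are
rejected, while an ordinary tag such as "en-US" is accepted.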

I need to break off now. More comments later.


Peter Constable
Microsoft Corporation

