New Last Call: 'Tags for Identifying Languages' to BCP

Mark Davis mark.davis at
Sat Dec 11 00:34:47 CET 2004

As Peter says, there are many inaccuracies in Bruce's message. After
scanning through his points and discarding all of the inaccurate ones, I see
no substantive technical issues. It appears that much of the problem was
working with an old version of the document. In addition to what Peter said,
I'll add one more:

>> Accessibility has not been a problem for this implementor...
> I agree with Bruce, that accessibility of ISO 639 and ISO 3166 has not

The issue could have been more clearly stated in the text; the point is:
1. Because of the lack of stability of the underlying standards, what is a
conformant tag at any given time may change (and in particular a conformant
tag may become unconformant -- and *has* in the past).
2. So if I need to establish that string X was conformant to RFC 3066 as of
a particular date, to see if I am interoperable with another implementation
that is conformant as of that date, then I need to reconstruct the status of
the underlying standards as of that date.
3. Reconstructing the status of each underlying standard as of a given date
is very clumsy and error-prone; accessibility is in fact a significant


There is one other point, about the ISO 8601 format; I agree with Bruce about
problems with it. What we should have made clear is that the date format was
a subset of 8601, and that it would be completely defined by the pattern
YYYY-MM-DD. (That way people wouldn't have to get a copy of the standard,
and could easily parse the subset that we actually use.)
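
To illustrate how small that subset is, here is a minimal sketch of a parser
for the YYYY-MM-DD pattern alone. The function name parse_registry_date is
hypothetical and is not taken from the draft; the only assumption is that a
date is four ASCII digits, a hyphen, two digits, a hyphen, and two digits,
naming a real calendar day.

```python
import re
from datetime import date

# The only shape accepted: four ASCII digits, hyphen, two digits, hyphen, two digits.
DATE_PATTERN = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def parse_registry_date(text: str) -> date:
    """Parse a YYYY-MM-DD string; reject every other ISO 8601 form."""
    match = DATE_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not in YYYY-MM-DD form: {text!r}")
    year, month, day = (int(part) for part in match.groups())
    # date() raises ValueError itself for impossible days such as 2004-02-30.
    return date(year, month, day)


if __name__ == "__main__":
    print(parse_registry_date("2004-12-11"))
```

Other forms that ISO 8601 allows, such as the basic format 20041211 or the
week date 2004-W50-6, are rejected by this sketch, which is the point of
restricting the format to a single pattern.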


----- Original Message ----- 
From: "Peter Constable" <petercon at>
To: <ietf-languages at>
Sent: Friday, December 10, 2004 13:37
Subject: RE: New Last Call: 'Tags for Identifying Languages' to BCP

Bruce Lilly's message makes several inaccurate statements against the
proposed draft, and misrepresents some of the changes being made. My
main concern is that I don't know where it was circulated. I might be
wrong, but I get the impression it was written with a different audience
in mind and then copied here.

> -----Original Message-----

> > There are problems with the RFC 3066 definition of generative
> > however. The ISO 639 and ISO 3166 standards are not freely available
> > and evolve over time.
> Accessibility has not been a problem for this implementor...

I agree with Bruce, that accessibility of ISO 639 and ISO 3166 has not
been the issue. Unfortunately, his comments do not indicate what the
real issues were.

> > The largest change in the specification is that it modifies the
> > structure of the language tag registry. Instead of having to obtain
> > lists of codes from five separate external standards...

> Contrary to the implicit claim, the ISO documents mentioned
> above comprise two standards (available in two languages each),
> not "five separate external standards".

RFC 3066 made reference to ISO 639-1, ISO 639-2 and ISO 3166-1; the
proposed replacement adds ISO 15924. I would count that as four ISO
standards. Up-to-date code tables for all four are readily available.

> The availability of those two definitive standards in bilingual
> forms allows implementors to (for example) construct menus of
> available language and country code tags in BOTH languages used
> in ISO standards.  The draft proposes declaring those standards
> effectively irrelevant, being replaced by a single monolingual
> (English) IANA registry. While it has become fashionable in
> recent years among some factions within the United States
> to bash France, the French people, their culture, and their
> language, it seems inappropriate to extend such bashing to
> technical standards which supposedly apply in an international
> context. Especially when dealing with the subject matter of
> language itself. The unavailability of the registered value
> "description" in 50% of the languages traditionally used for
> international standards publication, including the existing ISO
> 639 and 3166 codes, is a serious defect in the proposal, and
> a departure from the status quo under RFC 3066...

I think this is a serious misrepresentation of the intent of the
proposal: the draft nowhere suggests, let alone declares, that the
source ISO standards are irrelevant. Rather, the intent of the
comprehensive registry is to ensure stability in IETF implementations by
protecting them from unpredictable changes in ISO standards, such as the
re-definition of "CS" as a country identifier not long ago. The
denotation of identifiers listed in the registry is based on their
definition in the ISO standards, not on an informative descriptor
provided in the registry; and as Bruce quite clearly pointed out, those
source standards are readily accessible. So the suggestion that
implementers will no longer have access to French-language names from
the source ISO standards is simply vacuous.

As for concerns of Anglo-centricity, I'm sure that the authors had no
anti-French motive, and would be open to suggestions as to how that
could be addressed. Surely, though, this is not a technical argument
against the proposal.

> The ABNF in the draft permits all of the following tags which
> are not legal per the RFC 3066 ABNF:
>    supercalifragilisticexpialidoceus
>    y-----
>    x1234567890abc
>    a123-xyz

In fact, none of these is permitted by the ABNF of the draft.

> Specifically, the draft allows, and RFC 3066 disallows:
>    subtags more than 8 octets in length

This is incorrect. It was true of an earlier draft, but that was

>    hyphens which do not separate subtags
>    zero-length subtags

These near-equivalent statements are incorrect. No hyphen may be
permitted without a non-initial sub-tag, and no sub-tag can be an empty

>    primary tags which are not purely alphabetic

This is incorrect. A primary sub-tag must be 2*3ALPHA or 4*8ALPHA, or
"i" or "x".
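
As a rough illustration only, the constraints mentioned in this message can
be sketched as a small check. This is not the draft's full ABNF: it encodes
just the primary sub-tag rule quoted above, and it assumes that every later
sub-tag is 1 to 8 letters or digits, as in RFC 3066. The function name
passes_basic_checks is hypothetical.

```python
import re

# Primary sub-tag as stated above: 2*3ALPHA or 4*8ALPHA, or "i" or "x".
PRIMARY = re.compile(r"[A-Za-z]{2,3}|[A-Za-z]{4,8}|[ixIX]")
# Assumption: later sub-tags are 1*8(ALPHA / DIGIT), as in RFC 3066.
SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")


def passes_basic_checks(tag: str) -> bool:
    """Return True if the tag satisfies the constraints sketched above."""
    parts = tag.split("-")
    if PRIMARY.fullmatch(parts[0]) is None:
        return False
    # An empty string between hyphens fails SUBTAG, so "y-----" style
    # tags with zero-length sub-tags are rejected here as well.
    return all(SUBTAG.fullmatch(part) is not None for part in parts[1:])


if __name__ == "__main__":
    for tag in ("supercalifragilisticexpialidoceus", "y-----",
                "x1234567890abc", "a123-xyz", "en-US"):
        print(tag, passes_basic_checks(tag))
```

Under these assumptions all four of the tags Bruce lists are rejected, while
a tag such as en-US is accepted.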

I need to break off now. More comments later.

Peter Constable
Microsoft Corporation