New Last Call: 'Tags for Identifying Languages' to BCP

John Cowan jcowan at
Sun Dec 12 21:34:44 CET 2004

Bruce Lilly scripsit:

> Moreover, the point is that countries do change, and that use
> of country codes (as provided for in RFC 3066 and in the
> proposed draft) carries with it the inherent instability
> which is characteristic of politics.  A quest for "stability"
> of countries seems Quixotic and oxymoronic.  

Of course countries change, and then the numeric country codes change
as well.  The point is that the alpha codes change for political reasons
when there has been *no* change in the underlying country:  Romania's
alpha-3 code changed from ROM to ROU without any change in Romania at all.
The CS case is particularly gratuitous, as its denotation changed from
"Czechoslovakia" (a no longer existent country) to "Serbia and Montenegro"
(a newly created country).

> A related problem with the use of country codes in language
> tags is that there is not necessarily an inherent relationship
> between language and country borders.  

Of course not.  But for the most part, variations in orthography
do tend to follow national boundaries, since orthography in many
languages is either de jure or de facto a national matter.
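
For instance, under RFC 3066 a two-letter second subtag is read as an
ISO 3166 country code, so en-GB and en-US can select between national
orthographies.  A minimal sketch in Python (the name split_tag is
hypothetical, and it handles only the simple language-country form):

```python
def split_tag(tag):
    """Split a tag such as 'en-GB' into (language, country or None)."""
    subtags = tag.split("-")
    language = subtags[0].lower()
    country = None
    # RFC 3066: a two-letter second subtag is an ISO 3166 country code.
    if len(subtags) > 1 and len(subtags[1]) == 2 and subtags[1].isalpha():
        country = subtags[1].upper()
    return language, country
```

A tag with no country subtag, such as "fr", simply yields no country.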

> As far as I can tell,
> the draft doesn't really deal with the issue of changing borders
> or changing country names -- it merely pretends that these
> things don't happen by attempting to declare a snapshot of the
> status at some point in time as being valid for all time.

No, it attempts to freeze the code-to-country mapping at a single
point in time.  New countries or changes in old countries should involve
only the addition of new codes, not the reuse of old ones.
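
As a rough illustration of such a freeze (a sketch only, not anything
the draft itself specifies; the class FrozenCodeRegistry and its methods
are hypothetical names), an add-only mapping might look like this:

```python
class FrozenCodeRegistry:
    """Add-only mapping from code to country name (hypothetical sketch)."""

    def __init__(self):
        self._codes = {}

    def add(self, code, country):
        # A code, once assigned, is never pointed at a different country.
        if code in self._codes and self._codes[code] != country:
            raise ValueError(
                f"code {code!r} already denotes {self._codes[code]!r}"
            )
        self._codes[code] = country

    def lookup(self, code):
        return self._codes[code]


registry = FrozenCodeRegistry()
registry.add("CS", "Czechoslovakia")
try:
    # Reusing CS for a different country is rejected.
    registry.add("CS", "Serbia and Montenegro")
    reused = True
except ValueError:
    reused = False
```

Under such a rule the new country would have to receive a code that had
never been assigned before, and CS would keep its old denotation.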

> Where is the implementor supposed to get the *official*
> translation for display?  

I don't know.  Where is the implementor supposed to get the
official German, or Catalan, or Mandarin translations?
Not in the ISO registry, for sure.  To say nothing of the
cases where no official translations exist.

> > There are 6000 languages spoken on Earth, of which 
> > perhaps 600 have a standard written form.
> ISO 639 lists about 650, not precisely 6000.

ISO 639-2 is deliberately incomplete.  The current draft of ISO 639-3,
which is not yet an International Standard, lists over 7000 languages.

> It might be worthwhile considering the differences in the
> way language tags are used, by whom they are used, and for
> what purpose.  There may well be a substantial difference
> between use of a tag to represent an obscure dialect of a
> dead language in a research paper vs. tagging a piece of
> text in one of the core Internet protocols such as SMTP.

That count does not include dead languages.  Whether it includes
dialects is a matter of terminology.

Deshil Holles eamus.  Deshil Holles eamus.  Deshil Holles eamus.
Send us, bright one, light one, Horhorn, quickening, and wombfruit. (3x)
Hoopsa, boyaboy, hoopsa!  Hoopsa, boyaboy, hoopsa!  Hoopsa, boyaboy, hoopsa!
  -- Joyce, Ulysses, "Oxen of the Sun"       jcowan at
