Mark Davis mark.davis at
Sat Feb 12 19:18:36 CET 2005

I agree completely with Peter.

I want to point out that there are currently 338,345 valid language tags
according to RFC 3066. The great value of RFC 3066 is that people *didn't*
have to go through a registration process for the overwhelming majority of
these cases. The tags that are needed for distinctions in information technology
can be used; those that are not will simply either not get used or get
fallback behavior if they are used outside of a domain where people care to
make the distinction. That means that if someone wants to use a tag like
fr-MX to capture a particular distinction that is important in their domain,
they are free to do so. Nobody has shown any particular harm from the fact
that the varieties of fr-* listed at the end of this message are all
valid 3066 language tags, for example.
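
The fallback behavior mentioned above can be sketched roughly as follows.
This is only an illustration, assuming a simple truncation scheme (drop the
last subtag until something matches); RFC 3066 itself does not mandate this
algorithm, and the names fallback_chain, lookup, and the sample resource
table are hypothetical.

```python
def fallback_chain(tag):
    """Return the tag followed by progressively shorter prefixes.

    Assumption: fallback works by dropping the last subtag each time.
    """
    subtags = tag.split("-")
    return ["-".join(subtags[:i]) for i in range(len(subtags), 0, -1)]


def lookup(tag, resources):
    """Return the first resource whose key matches the tag or a prefix of it."""
    # Language tags are case-insensitive, so compare in lowercase.
    lowered = {key.lower(): value for key, value in resources.items()}
    for candidate in fallback_chain(tag.lower()):
        if candidate in lowered:
            return lowered[candidate]
    return None


# Hypothetical resource table: only generic and Canadian French are present.
resources = {"fr": "generic French strings", "fr-CA": "Canadian French strings"}

print(lookup("fr-CA", resources))  # an exact match is used where one exists
print(lookup("fr-MX", resources))  # no fr-MX entry here, so "fr" is used
```

In a domain where someone does care about fr-MX, adding an "fr-MX" entry to
the table is all it takes; everyone else keeps getting the "fr" behavior.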

Now, the particular registrations that were proposed recently by Microsoft
and IBM will become redundant once we can get 3066bis through the process.
But since 3066bis is taking far longer than anticipated, in the interim we
must move forward on the registrations that have real, immediate business
requirements. We held off on these for a long time, but can't wait longer.

Clearly nobody is doing this on a whim: there is a real industry need for
distinguishing the usage of particular language/country combinations
according to script. We need, for example, to distinguish Chinese used in
Hong Kong according to simplified vs. traditional script, as do
a great many other businesses and other user communities. If RFC 3066 and
its successors are to continue to be used as the standard for language tags,
they have to be responsive in making the distinctions that meet the needs of
the IT industry.



Valid 3066 language tags starting with fr- :

fr-AA, fr-AD, fr-AE, fr-AF, fr-AG, fr-AI, fr-AL, fr-AM, fr-AN, fr-AO, fr-AQ,
fr-AR, fr-AS, fr-AT, fr-AU, fr-AW, fr-AX, fr-AZ, fr-BA, fr-BB, fr-BD, fr-BE,
fr-BF, fr-BG, fr-BH, fr-BI, fr-BJ, fr-BM, fr-BN, fr-BO, fr-BQ, fr-BR, fr-BS,
fr-BT, fr-BU, fr-BV, fr-BW, fr-BY, fr-BZ, fr-CA, fr-CC, fr-CD, fr-CF, fr-CG,
fr-CH, fr-CI, fr-CK, fr-CL, fr-CM, fr-CN, fr-CO, fr-CR, fr-CS, fr-CT, fr-CU,
fr-CV, fr-CX, fr-CY, fr-CZ, fr-DD, fr-DE, fr-DJ, fr-DK, fr-DM, fr-DO, fr-DY,
fr-DZ, fr-EC, fr-EE, fr-EG, fr-EH, fr-ER, fr-ES, fr-ET, fr-FI, fr-FJ, fr-FK,
fr-FM, fr-FO, fr-FQ, fr-FR, fr-FX, fr-GA, fr-GB, fr-GD, fr-GE, fr-GF, fr-GH,
fr-GI, fr-GL, fr-GM, fr-GN, fr-GP, fr-GQ, fr-GR, fr-GS, fr-GT, fr-GU, fr-GW,
fr-GY, fr-HK, fr-HM, fr-HN, fr-HR, fr-HT, fr-HU, fr-HV, fr-ID, fr-IE, fr-IL,
fr-IN, fr-IO, fr-IQ, fr-IR, fr-IS, fr-IT, fr-JM, fr-JO, fr-JP, fr-JT, fr-KE,
fr-KG, fr-KH, fr-KI, fr-KM, fr-KN, fr-KP, fr-KR, fr-KW, fr-KY, fr-KZ, fr-LA,
fr-LB, fr-LC, fr-LI, fr-LK, fr-LR, fr-LS, fr-LT, fr-LU, fr-LV, fr-LY, fr-MA,
fr-MC, fr-MD, fr-MG, fr-MH, fr-MI, fr-MK, fr-ML, fr-MM, fr-MN, fr-MO, fr-MP,
fr-MQ, fr-MR, fr-MS, fr-MT, fr-MU, fr-MV, fr-MW, fr-MX, fr-MY, fr-MZ, fr-NA,
fr-NC, fr-NE, fr-NF, fr-NG, fr-NH, fr-NI, fr-NL, fr-NO, fr-NP, fr-NQ, fr-NR,
fr-NT, fr-NU, fr-NZ, fr-OM, fr-PA, fr-PC, fr-PE, fr-PF, fr-PG, fr-PH, fr-PK,
fr-PL, fr-PM, fr-PN, fr-PR, fr-PS, fr-PT, fr-PU, fr-PW, fr-PY, fr-PZ, fr-QA,
fr-QM, fr-QN, fr-QO, fr-QP, fr-QQ, fr-QR, fr-QS, fr-QT, fr-QU, fr-QV, fr-QW,
fr-QX, fr-QY, fr-QZ, fr-RE, fr-RH, fr-RO, fr-RU, fr-RW, fr-SA, fr-SB, fr-SC,
fr-SD, fr-SE, fr-SG, fr-SH, fr-SI, fr-SJ, fr-SK, fr-SL, fr-SM, fr-SN, fr-SO,
fr-SR, fr-ST, fr-SU, fr-SV, fr-SY, fr-SZ, fr-TC, fr-TD, fr-TF, fr-TG, fr-TH,
fr-TJ, fr-TK, fr-TL, fr-TM, fr-TN, fr-TO, fr-TP, fr-TR, fr-TT, fr-TV, fr-TW,
fr-TZ, fr-UA, fr-UG, fr-UM, fr-US, fr-UY, fr-UZ, fr-VA, fr-VC, fr-VD, fr-VE,
fr-VG, fr-VI, fr-VN, fr-VU, fr-WF, fr-WK, fr-WS, fr-XA, fr-XB, fr-XC, fr-XD,
fr-XE, fr-XF, fr-XG, fr-XH, fr-XI, fr-XJ, fr-XK, fr-XL, fr-XM, fr-XN, fr-XO,
fr-XP, fr-XQ, fr-XR, fr-XS, fr-XT, fr-XU, fr-XV, fr-XW, fr-XX, fr-XY, fr-XZ,
fr-YD, fr-YE, fr-YT, fr-YU, fr-ZA, fr-ZM, fr-ZR, fr-ZW, fr-ZZ

----- Original Message ----- 
From: "Peter Constable" <petercon at>
To: <ietf-languages at>
Sent: Saturday, February 12, 2005 01:39

> From: Michael Everson [mailto:everson at]

> > > On the basis of established precedent, I don't see how the documentation
> > > provided can be considered insufficient.
> >
> > I think this is a variant of the existing complaint: the references have to
> > document the use of Mongolian *in China using Mongolian script*.
> The banknotes do that.
> But it does not differ from Mongolian writing in Mongolia.

It is a *serious* mistake to go from a statement like "We do not know of any
difference between Mongolian used in Mongolia versus Mongolian used in China"
to the conclusion "there is no need for country IDs CN and MN in language
tags for Mongolian". I will repeat what I wrote on Feb 3:

"The important thing for us is not to establish precisely what every
distinction is (an endless task involving an ever-changing domain over which
different interpretations are possible), but rather to ensure that the
intended meaning of any tag is understood by all and for which it is clear,
to some minimal level, how to utilize it."

To elaborate on *why* that is the case, language tags need to distinguish
not just differences between those abstract entities out in the real world
that we call "languages", but rather they need to distinguish **uses** of
language in content and all kinds of digital language resources.

This is the whole point behind the debate we had over "es-americas": the
point wasn't whether there was an identifiable dialect corresponding to that
tag; rather, the point was that there are scenarios in which language
resources have a linguistic property that *that* tag reflects and that need
to be distinguished from other language resources.

In the Mongolian case, we cannot dictate that nobody should ever have, say,
a terminology database in which Mongolian terms used in China are distinct
from Mongolian terms for the same concepts used in Mongolia. In other words,
we may not know of any linguistic difference between "mn-Mong-MN" and
"mn-Mong-CN", but we absolutely cannot assert that there can never be a need
for someone to distinguish language resources using "mn-Mong-MN" and
"mn-Mong-CN".

To take another example, from a descriptive-linguistic perspective, I
wouldn't expect there to be any difference between fr-CI and fr-GH. But if
(e.g.) in some obscure commercial domain there is some difference between a
term used for a concept in Côte d'Ivoire and Ghana, then there is a
legitimate reason for the use of fr-CI and fr-GH in tagging content, and we
absolutely cannot assert that such situations cannot exist.
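
As a rough illustration of this point, here is a sketch of a terminology
table keyed by language tag. The concept name, the placeholder terms, and the
function term_for are invented for the example; nothing here reflects actual
usage in Côte d'Ivoire or Ghana.

```python
# Hypothetical terminology database: (concept, language tag) -> term.
# The regional entries are placeholders, made up to show the mechanism.
terms = {
    ("invoice", "fr"): "facture",
    ("invoice", "fr-CI"): "term-used-in-CI",
    ("invoice", "fr-GH"): "term-used-in-GH",
}


def term_for(concept, tag):
    """Return the term recorded for exactly this concept and tag, else None."""
    return terms.get((concept, tag))


# The two tags select different records, whether or not a linguist would
# describe fr-CI and fr-GH as different varieties of French.
print(term_for("invoice", "fr-CI"))
print(term_for("invoice", "fr-GH"))
```

The tag works here purely as a metadata key: the database needs the two tags
to be distinct in order to keep the two records apart.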

I fully appreciate the concerns of linguistic purity, being a linguist
myself, but we are not doing descriptive linguistics here, we are doing IT
implementation; and over-zealous application of descriptive-linguistic
ideals can lead us into incorrect thinking. Language tags are not intended
to be documentation of human knowledge of languages; they are intended as
metadata elements for distinguishing linguistic properties of language
resources and general linguistic content.

Peter Constable
Ietf-languages mailing list
Ietf-languages at