Region subtags under 3066 and 3066bis (was: Re: LANGUAGE TAG REGISTRATION FORM: mn-Mong-CN)

Doug Ewell dewell at adelphia.net
Sun Feb 13 21:08:15 CET 2005


Frank Ellermann <nobody at xyzzy dot claranet dot de> wrote:

>> Valid 3066 language tags starting with fr- :
>>
>> fr-AA
>> fr-DD
>> fr-NH
>> fr-QM
>> fr-XA
> [etc. up to]
>> fr-XZ
>> fr-ZZ
>
> My copy of RfC 3066 says that you must not use these tags.
> Some like fr-DD or fr-NH are not more mentioned in ISO 3166.

I think Frank is correct.  RFC 3066 defines ISO 3166 as the source for
what we now call "region subtags."  Since code elements like DD and NH
are no longer included in ISO 3166, I don't think they are allowable in
RFC 3066 tags.

RFC 3066 specifies the use of code elements from ISO 3166:1988 "or
subsequently assigned by the ISO 3166 maintenance agency or governing
standardization bodies."  I interpret the word "assigned" to mean both
additions and deletions.  (Opposing views are welcomed.)

Allowing these "withdrawn" code elements in RFC 3066bis tags is an
intentional change, so that changes to ISO 3166 do not invalidate
previously valid language tags.  If you interpret RFC 3066 strictly (and
with my interpretation of "assigned"), the language tag "tet-TP" ("Tetum
as spoken in East Timor") ceased to be valid on 2002-05-20, when the ISO
3166 code for East Timor was changed from TP to TL.  Under RFC 3066bis,
both "tet-TP" and "tet-TL" are valid tags, though the latter is
canonical because TL was the ISO 3166 code as of the cutoff date of
2005-01-01.
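
A minimal sketch of that canonicalization step, for illustration only. The mapping table and the canonicalize() helper are hypothetical names, the table holds just the one TP -> TL change mentioned above, and "any two-letter subtag after the first" is used as a simplified stand-in for real region-subtag parsing:

```python
# Hypothetical, deliberately tiny table: only the TP -> TL change
# discussed in this message. A real implementation would load the
# full set of withdrawn codes from the subtag registry.
CANONICAL_REGION = {"TP": "TL"}


def canonicalize(tag: str) -> str:
    """Return the tag with a withdrawn region subtag replaced, if known.

    Simplification (an assumption of this sketch): the first two-letter
    alphabetic subtag after the primary language subtag is treated as
    the region subtag.
    """
    subtags = tag.split("-")
    for i, sub in enumerate(subtags[1:], start=1):
        if len(sub) == 2 and sub.isalpha():
            region = sub.upper()
            subtags[i] = CANONICAL_REGION.get(region, region)
            break
    return "-".join(subtags)
```

With this table, both "tet-TP" and "tet-TL" are accepted as input, and both come out as "tet-TL".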

To be consistent with this principle, ALL "formerly used" ISO 3166 code
elements are valid region subtags under RFC 3066bis.  Thus fr-DD and
fr-NH are valid under RFC 3066bis (though not under RFC 3066), even
though DD and NH were withdrawn from ISO 3166 long before language tags
came into use.

Regarding the "user-assigned" or "private-use" ISO 3166 code elements
(AA, QM-QZ, XA-XZ, and ZZ), these are explicitly disallowed by Section
2.2 of RFC 3066.  Frank is definitely correct on this.  Again, this is
an intentional change in RFC 3066bis.  The idea is that a partially
private-use tag ("fy-XX") is often better than a totally private-use tag
("x-whatever"), since at least some publicly interchangeable information
can be extracted from the former.
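
As a rough sketch, the private-use ranges listed above (AA, QM-QZ, XA-XZ, and ZZ) can be tested like this. The function name is_private_use_region() is made up for this example; it is not taken from either RFC:

```python
def is_private_use_region(code: str) -> bool:
    """True if code falls in the ISO 3166 user-assigned alpha-2 ranges
    named in this message: AA, QM-QZ, XA-XZ, and ZZ."""
    code = code.upper()
    # Only two-letter ASCII codes can be alpha-2 code elements.
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        return False
    if code in ("AA", "ZZ"):
        return True
    first, second = code
    if first == "Q":
        return "M" <= second <= "Z"
    # Every X? code from XA through XZ is user-assigned.
    return first == "X"
```

So the region subtag of "fy-XX" is recognized as private-use, while the "fy" part remains publicly interpretable.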

> If you want to use them anyway, why no fr-EA, maybe French
> as in Melilla makes sense.  And why no fr-EU if you think
> that fr-NT is okay ?

EA and EU are "reserved" by ISO 3166/MA, but have never actually been
"assigned" code elements.  That is the difference.  RFC 3066bis allows
currently and previously assigned code elements to be used in language
tags, but not reserved code elements.  The rationale is that the MA is
under no obligation to continue to reserve EA for Ceuta and Melilla;
they could always reassign it to, say, East Afghanistan if such an
entity came into existence.  (Indeed, ISO 3166/MA did assign the
reserved alpha-3 code element ROU to a different country from the one
for which it was reserved; we can only be thankful it wasn't an alpha-2
code element.)

If codes such as EA and EU were ever assigned formally by ISO 3166/MA,
they could always be added to the RFC 3066bis registry at that time.
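
To summarize the rules as I have described them in this message, here is a small sketch. The Status enum, the EXAMPLE_STATUS table, and both function names are invented for illustration; the table covers only codes discussed in this thread, and the RFC 3066 rule reflects my strict reading of "assigned," not a settled interpretation:

```python
from enum import Enum


class Status(Enum):
    ASSIGNED = "assigned"        # currently in ISO 3166
    WITHDRAWN = "withdrawn"      # formerly assigned, since removed
    PRIVATE_USE = "private-use"  # user-assigned (AA, QM-QZ, XA-XZ, ZZ)
    RESERVED = "reserved"        # reserved by ISO 3166/MA, never assigned


# Illustrative only: statuses as described in this message.
EXAMPLE_STATUS = {
    "TL": Status.ASSIGNED,
    "TP": Status.WITHDRAWN,
    "DD": Status.WITHDRAWN,
    "NH": Status.WITHDRAWN,
    "XX": Status.PRIVATE_USE,
    "EA": Status.RESERVED,
    "EU": Status.RESERVED,
}


def valid_under_rfc3066(status: Status) -> bool:
    # Strict reading: only currently assigned code elements qualify.
    return status is Status.ASSIGNED


def valid_under_rfc3066bis(status: Status) -> bool:
    # Currently assigned, previously assigned, and private-use codes
    # are allowed; merely reserved codes are not.
    return status in (Status.ASSIGNED, Status.WITHDRAWN, Status.PRIVATE_USE)
```

Under this sketch, DD is rejected by the RFC 3066 check and accepted by the RFC 3066bis check, while EA and EU are rejected by both.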

BTW, RFC 3066bis would allow "fr-015" meaning "French as spoken in
Northern Africa," and "fr-150" meaning "French as spoken in Europe,"
possibly meeting the needs envisioned for "fr-EA" and "fr-EU" above.

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/



