Region subtags under 3066 and 3066bis

Doug Ewell dewell at adelphia.net
Thu Feb 17 17:53:01 CET 2005


Frank Ellermann <nobody at xyzzy dot claranet dot de> wrote:

>> EA and EU are "reserved" by ISO 3166/MA, but have never
>> actually been "assigned" code elements.  That is the
>> difference.
>
> Okay, that makes sense.  So I can use the (green) "officially
> assigned" country codes in RfC 3066 language tags as found on...
>
> ...but not the (gray) "transitionally reserved" codes.  That
> kills also fr-NT and any other ??-NT, good riddance.  I've no
> problem with keeping ??-TP in a hypothetical 3066bis based on
> a cut-off date, as long as it contains a complete frozen list.

The list is semi-frozen, in the sense that there can always be "new old"
entries.  If a country changes its name and ISO 3166/MA changes the
corresponding code element, the region subtag derived from the previous
code element would still be available (indeed, canonical) in addition to
the new subtag.

The list is complete in the sense that it includes all currently and
previously assigned ISO 3166 code elements, except those that have been
reused.  For example, there is a subtag AI for Anguilla, so there could
not also be another subtag AI for French Afars and Issas.
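
As a rough illustration of that rule (a sketch only, not text from the
draft), the list could be built as below.  The function name and the
sample data are made up for the example; the real input would be the
full ISO 3166 lists of current and withdrawn code elements.

```python
# Illustrative sample data only; the real lists come from ISO 3166/MA.
CURRENT = {"AI": "Anguilla", "GB": "United Kingdom", "SH": "Saint Helena"}

# Withdrawn code elements as (code, former name); sample entries only.
WITHDRAWN = [
    ("AI", "French Afars and Issas"),
    ("TP", "East Timor"),
    ("YU", "Yugoslavia"),
]


def build_region_subtags(current, withdrawn):
    """Return current codes plus withdrawn codes that were never reused."""
    subtags = dict(current)
    for code, name in withdrawn:
        # A reused code keeps its current meaning; the older one is skipped.
        if code not in subtags:
            subtags[code] = name
    return subtags


REGION_SUBTAGS = build_region_subtags(CURRENT, WITHDRAWN)
```

Here AI stays Anguilla, while TP and YU are carried over from the
withdrawn list.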

> At the moment (RfC 3066) ??-TP is unfortunately obsolete.  It
> would be nice, if a future 3066bis allows existing ccTLDs (in
> addition to 3166 country codes), that would cover cases like
> TP and YU, and also some (yellow) "exceptionally reserved" AC,
> GG, IM, JE, EU, and UK. fr-GG and fr-JE are even no nonsense.

Although ccTLDs are based on ISO 3166, they are not relevant to the
issue of determining the regional scope of language usage.  TP and YU
are allowable in RFC 3066bis because they are formerly used ISO 3166
code elements, and thus previously admissible in language tags, not
because they are ccTLDs.

AC (Ascension Island) is covered by SH (Saint Helena).

GG (Guernsey) and JE (Jersey) are covered by 830 (Channel Islands); if
anyone spots a taggable difference between Guernsey French and Jersey
French, it can be addressed at that time.

IM (Isle of Man) is covered by 833.

EU (European Union) is covered, sort of, by 150 (Europe).  The political
association is not 1-to-1, but one might be left to ask what is meant by
"X as spoken in the European Union" in any event.

UK (United Kingdom) is covered by GB.  No language tags in *-UK have
been permissible to this point, and it seems unwise to begin allowing
them now (among other things, confusion with Ukraine is a real
possibility).
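
To summarize the mappings above, here is a small lookup sketch.  The
table and the helper name are hypothetical, nothing like this is
specified in the draft, and the EU entry is only the approximate match
described above.  It handles plain language-region tags and nothing
more elaborate.

```python
# Reserved codes mentioned above, mapped to the region subtag that covers
# them.  EU -> 150 is only an approximate match, as noted in the text.
COVERED_BY = {
    "AC": "SH",   # Ascension Island -> Saint Helena
    "GG": "830",  # Guernsey -> Channel Islands
    "JE": "830",  # Jersey -> Channel Islands
    "IM": "833",  # Isle of Man
    "EU": "150",  # European Union -> Europe (not 1-to-1)
    "UK": "GB",   # United Kingdom
}


def suggest_tag(tag):
    """Swap a reserved region code for its covering subtag, if there is one.

    Only two-part "language-region" tags are handled; anything else is
    returned unchanged.
    """
    parts = tag.split("-")
    if len(parts) == 2 and parts[1].upper() in COVERED_BY:
        return parts[0] + "-" + COVERED_BY[parts[1].upper()]
    return tag
```

For instance, suggest_tag("fr-GG") yields "fr-830" and
suggest_tag("en-UK") yields "en-GB", while a tag such as "fr-FR" is
returned as is.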

>> RFC 3066bis allows currently and previously assigned code
>> elements to be used in language tags
>
> With a proper IANA registry ?  Otherwise it would be hard to
> find "previously assigned" country codes like the now (white)
> "unassigned" DD or NH.

Yes, this is a major benefit of using a registry.  Having to hunt down a
reference to a withdrawn ISO 3166 code element (not available for free
from ISO) would be a recipe for trouble.

>> The rationale is that the MA is under no obligation to
>> continue to reserve EA for Ceuta and Melilla
>
> Okay.  OTOH they changed the status of AX to "official", and EA
> could also make sense.  Especially here, I'm far from sure that
> es is the only relevant language in EA.  Or maybe it's somewhat
> different from es-ES.

If ISO ever assigns EA to Ceuta and Melilla, we can talk about it then.

> Sorry, I still don't get it, found in I-D-phillips-langtags-10:
> Why are YU and NH okay, if IM is not okay ?  Is it because NH
> and YU were once "officially assigned", and IM was not so far ?

Yes.

> Apparently the procedures work as soon as the new registry is
> created, but determining its initial state is difficult.  If
> it's supposed to be compatible with 3066, then AC, BU, DD, EU,
> FX, GG, IM, JE, NH, SF, SU, TP, UK, YU, and ZR are all invalid
> at the moment.

See http://users.adelphia.net/~dewell/lstreg.html for a proposed
"initial state."  This is all subject to review and debate, of course.

It is "backward compatible" with 3066 in that it allows all valid 3066
constructions.  It extends 3066 in this regard by re-allowing BU, DD,
FX, NH, SU, TP, YU, and ZR.  The others were never valid in language
tags and 3066bis does not change this.
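
Put as a quick sketch (the names are invented for illustration, and the
sets cover only the fifteen codes listed in the quoted message, not the
whole registry):

```python
# Withdrawn ISO 3166 code elements that 3066bis would re-allow.
REALLOWED_IN_3066BIS = {"BU", "DD", "FX", "NH", "SU", "TP", "YU", "ZR"}

# Codes from the quoted list that were never valid in language tags.
NEVER_VALID = {"AC", "EU", "GG", "IM", "JE", "SF", "UK"}


def classify_region(code):
    """Classify one of the codes discussed in this thread."""
    code = code.upper()
    if code in REALLOWED_IN_3066BIS:
        return "re-allowed"
    if code in NEVER_VALID:
        return "never valid"
    # Anything else is outside the scope of this small example.
    return "not covered here"
```

So classify_region("YU") gives "re-allowed" and classify_region("IM")
gives "never valid".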

> If YU, NH, and IM are open for a future debate here, then it
> could be better if you avoid these codes as examples in the
> draft.  In the worst case - if you want to be compatible with
> RfC 1766 - you need all "officially assigned" country codes
> 1996.  That would kill DD, SF, NH, and SU, but still allow TP,
> YU, and ZR.  Probably I forgot some odd cases.

Everything, including the draft itself, is open for debate.  Certainly
the proposed initial state of the registry is debatable.



