Region subtags under 3066 and 3066bis (long)

Frank Ellermann nobody at xyzzy.claranet.de
Sun Feb 20 05:56:14 CET 2005


Doug Ewell wrote:

> 2005-01-01 is the only cutoff date I see.

The 3066bis draft-08 just failed in its "last call"; it's now
split into two drafts.  So that's not yet ready.  You probably
want to use the date of the publication as one of the future
cut-off dates, because that's essential for compatibility with
RfC 3066.  2005-01-01 isn't one of the relevant cut-off dates.

> That could certainly be changed to the date of publication
> (hopefully 2005-xx-xx), as long as no ISO codes are reused
> between now and then.

_Especially_ if some ISO codes are changed, because the changes
affect users of RfC 3066 immediately.  Let's say 200x-xx-xx to
be sure.

> ISO 3166-1 began in 1974.
[...]
> RFC 1766 was published in March 1995,
> RFC 3066 in January 2001.

>>| [ISO 3166]  ISO 3166:1988 (E/F)
>>[...]
>>| Standardization, 3rd edition, 1988-08-15.

Whatever you do, don't pick 1974.  A very conservative choice
justified by RfC 3066 is apparently 1988-08-15.

>> RH was never allowed under 1766/3066.

> As I said before, for consistency with the principle of
> allowing more recently withdrawn codes such as TP and YU.

These cases are bad enough; adding obsolete codes like RH only
makes it worse for a future RH etc.

> In any case, that seems a legitimate topic for discussion.

 [3166:1988]
> If we did this, there might be a problem with ISO 639
> language codes.

No, they promised to be sensible, it's a different standard.

> users were advised for YEARS AND YEARS afterward to use the
> old codes in tagging their content rather than the new ones,
> because "software would be more likely to recognize them."
> This seems silly to me, but it is a fact

Backwards compatibility isn't silly.  Changing codes without a
compelling reason is silly.  And it's the point of your future
registry with its persistent entries.

But you don't need backwards compatibility with something which
was never valid under RfC 1766 and RfC 3066 like the former RH.

In that case the best you can get is compatibility with a future
RH, i.e. don't block it by an obsolete code.

> 200 is used for Czechoslovakia because CS was taken by Serbia
> and Montenegro.  There is no date associated with this, other
> than the one and only 2005-01-01 cutoff date that says CS has
> its new meaning and not its old one.

But on 2005-01-01 there was no valid code 200 in the UN list,
it was removed 1993-01-01, as stated on:
<http://unstats.un.org/unsd/methods/m49/m49chang.htm>

It's not that I'm completely against it.  I only want to know
why you have 200 (the former CS, now CZ and SK) but not 582
(the former PC, now PW, FM, MH, and MP).  I proposed to remove
the obsolete PC; depending on your reasons, that could instead
justify adding 582.  Or get rid of 200, because it's as old as 582.

 [Why no numbers for the old AI, GE, and SK]
> No, it is because these former entities have no numeric code.

The old AI is now DJ, therefore you don't need 262.
The old GE is now KI, therefore you don't need 296.

The old SK is now a part of India (356).  Apparently that was
before they started with their numbers, because there is no
old number for Sikkim.  Your SKIN source says 1975.  Oops, and that
source says GEHH 296 claiming that the successors are KI
(296) _and_ TV (798).  Beats me.  But I proposed to remove GE.
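
One way to read the 200 versus 582 question: a withdrawn code with exactly
one successor can simply be replaced by that successor, while a code that
was split cannot. A minimal sketch of that reading follows. The table holds
only the cases named in this message, and the names SUCCESSORS and
has_single_successor are illustrative assumptions, not part of any draft
or registry.

```python
# Withdrawn alpha-2 codes and their successors, taken only from this thread.
# This is NOT an authoritative or complete list.
SUCCESSORS = {
    "AI": ["DJ"],
    "GE": ["KI"],                    # one source cited above lists KI and TV
    "SK": ["IN"],                    # Sikkim, now part of India
    "CS": ["CZ", "SK"],              # Czechoslovakia, former numeric 200
    "PC": ["PW", "FM", "MH", "MP"],  # former numeric 582
}


def has_single_successor(code):
    """True if the withdrawn code maps to exactly one current code."""
    return len(SUCCESSORS[code]) == 1


# Codes that were split, i.e. the ones where an old numeric is debatable.
split_codes = sorted(c for c in SUCCESSORS if not has_single_successor(c))
print(split_codes)
```

Under this reading CS and PC fall into the same group, which is the consistency point: either both get their old numbers or neither does.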

 [start of ISO 639 confusion]
> There is no entity that has one code but not the other.  This
> is different from ISO 639.

Sorry, I was talking about 639 when I said alpha-2 and alpha-3.

>> so you essentially copy all alpha-3 codes without alpha-2
>> alias to your alpha-3 section of the registry.

> There are no such codes.
 [end of ISO 639 confusion, probably we agree on this part]

Back to the obsolete country codes:

> Time out.  First we all need to sit down (figuratively) and
> talk about this, and decide if it is the right thing to do.
> That may not be obvious.  If this is decided, changing the
> list to remove BQ and friends is trivial.

Okay, talk about it, I'm on record with "anything but 1974" ;-)

> The rule, as I said before, is whether the new code
> corresponds exactly to the same plot of land as the old code.
> For BUMM it does.  For DDDE it does not.  Do you agree?

No.  CS is not exactly the same plot of land as YU before 1992.
You accepted YUCS but not DDDE, VDVN, or YDYE.  Either follow
ISO 3166 or trash it completely.  At least for DD there's no
problem if you say that all their languages are now used in DE
(de, nds, dsb, hsb, de-1901, de-1996). OTOH fy-DE is ambiguous
(one of two fy which are not fy-NL), fy-DD would make no sense.

> There is no way around the fact that an entity can change
> boundaries while its ISO 3166 code remains the same.

In the case of DE, YE, and VN they used to be one entity about
60 years ago, and they have again been one entity for more than
15 years.  You could try tricks with nds-DD versus nds-DE, but
it would not work.  The old DD and the old DE are both parts of
the new DE.  The old DD had no fy, the old DE had no dsb / hsb.

Forget any "exactly the same plot of land" rule.  Where did you
find it?  Is there something in the draft about the "soil"?

> We can't deprecate RFC 1766/3066 tags that use ISO 3166
> codes.  They are everywhere.

The problems start if you want to add another dimension like
scripts.  Two dimensions language + region somehow work, but
4 dimensions language + script + region + variant are a PITA.

As in en-boont, en-Latn-boont, en-US-boont, en-Latn-US-boont.
It's a bit like the mathematical problem of flattening graphs.
A concept of regions restricted to "country codes" also isn't
very convincing for some bigger countries like the US or GB, if
you (ab)use it for languages.
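
To illustrate how the forms multiply, here is a minimal sketch that
enumerates every tag obtained by keeping the language and including or
omitting each optional subtag. The function name tag_forms is an
illustrative assumption, and the code does not validate subtags against
any registry.

```python
from itertools import product


def tag_forms(language, script=None, region=None, variant=None):
    """Return all tags that keep the language and include or omit
    each supplied optional subtag, in script-region-variant order."""
    optional = [s for s in (script, region, variant) if s is not None]
    forms = []
    for keep in product((False, True), repeat=len(optional)):
        parts = [language] + [s for s, k in zip(optional, keep) if k]
        forms.append("-".join(parts))
    return forms


forms = tag_forms("en", script="Latn", region="US", variant="boont")
boont_forms = [f for f in forms if f.endswith("-boont")]
print(len(forms))
print(boont_forms)
```

With three optional subtags there are eight forms in total, four of which carry the boont variant; those four are the tags listed above.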

>> The sky will fall on a _future_ FQ if they can't use their
>> own country code like almost everybody else in the world,
>> because you decided that it stands for an uninhabited
>> territorial claim in AQ acknowledged by neither the UN nor
>> the US.

> If ISO 3166/MA disregards the will of the world and reassigns
> FQ, it will be to a new entity defined by UN.  That entity
> will have been assigned a UN M.49 numeric code, RFC 3066bis
> will use that, and the sky will stay right where it is.

Wait a moment, I certainly love to bash ISO 3166, but it's not
their problem if _you_ revive obsolete codes like FQ which were
removed decades ago.  And the population of a future FQ
will hate the obscure UN number and you, when everybody else
(maybe minus GG, IM, and JE) has "real" alpha-2 country codes.

                             Bye, Frank




More information about the Ietf-languages mailing list