Language X within scope of language Y

Thu Feb 3 17:28:34 CET 2005

"L.Gillam" <L dot Gillam at surrey dot ac dot uk> wrote:

>> ISO 3166-2 codes are variable-length, although the first two
>> characters are always the relevant 3166-1 alpha-2 code.
>
> I read that too. A mix of alpha-2, alpha-3 and digits. I noted the 2-2
> particularly due to implications this might have had for any possible
> use in 3066 and the interpretation of the -2.

ISO 3166-2 codes can't be used in RFC 3066bis language tags, except
through a private-use subtag (e.g. "cy-GB-x-SWA").  I thought about
proposing an extension mechanism for them -- just substitute some other
single letter for the "x" -- but I don't think they meet the stability
requirements.
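
As a rough illustration of the private-use workaround, here is a minimal Python sketch. The helper name tag_with_subdivision is hypothetical, and the 1-to-8 alphanumeric length check reflects my reading of the subtag syntax rather than anything normative; it only shows how the subdivision part of an ISO 3166-2 code could be moved behind "x".

```python
import re


def tag_with_subdivision(language: str, iso3166_2: str) -> str:
    """Hypothetical helper: carry an ISO 3166-2 subdivision in a private-use subtag.

    The country part becomes the region subtag; the subdivision part goes
    after "x", since it cannot appear as a regular subtag.
    """
    country, _, subdivision = iso3166_2.partition("-")
    if not re.fullmatch(r"[A-Za-z]{2}", country):
        raise ValueError(f"expected an alpha-2 country prefix, got {country!r}")
    # Assumption: a private-use subtag is 1 to 8 ASCII letters or digits.
    if not re.fullmatch(r"[A-Za-z0-9]{1,8}", subdivision):
        raise ValueError(f"unusable subdivision part: {subdivision!r}")
    return f"{language.lower()}-{country.upper()}-x-{subdivision.upper()}"


print(tag_with_subdivision("cy", "GB-SWA"))
```

A tag built this way is only meaningful by private agreement between the parties exchanging it, which is exactly the limitation of "x".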

The issue of how to encode "England" or "Scotland" or "Wales" or
"Northern Ireland" in an RFC 3066bis language tag remains unresolved.
ISO 3166-2 doesn't solve this problem because it doesn't provide code
elements for these "first-level divisions."  It does list the codes ENG,
SCT, WLS (alternatively CYM), and NIR respectively -- as well as CHA for
the Channel Islands and IOM for, well, IOM -- but these are informative
only; there is no such ISO 3166-2 code as *GB-ENG (I believe the
reference to "GB-CYM" is an editorial error).

>> The so-called alpha-4s are just two concatenated alpha-2 codes, and
>> code for changes in codes: thus YUCS indicates that the country
>> formerly coded YU is now coded CS.  Such changes in code usually,
>> but not always, reflect underlying changes in country name.
>
> Yes, sub-parsing:
>
> "As you can see, the code elements for formerly used country names
> have a length of four alphabetical characters (alpha-4 code elements).
> The first two characters are in all cases the original alpha-2 code
> element representing the former country name removed from ISO
> 3166(-1). Characters three and four are allocated according to rules
> established in ISO 3166-3."
>
> Yugoslavia?
>
> http://www.iso.org/iso/en/prods-services/iso3166ma/03updates-on-iso-3166/nlp3i-3.html

RFC 3066bis already allows region subtags that are based on formerly
used country names (the first half of an ISO 3166-3 code).  You can
write "cs-YU", "ru-SU", "de-DD", and so on, because all these region
codes are defined in the registry.  This is one of the important
features of having a registry instead of using the ISO codes directly.

>> ISO 3166-3 is not relevant to RFC 3066 or RFC 3066bis.
>
> Your answer, I guess, depends on your definition of relevant. And,
> perhaps, whether you want to refer to ISO standards for dealing with
> Yugoslavia. I'd have thought these could be used in combination with
> the singletons of 3066bis. Perhaps not?

Just use "-YU", as shown above.

"YU" vs. "CS" is not actually the best example of this anyway, since the
ISO code "YU" had already changed scope 10 or more years ago to refer to
the new, smaller "Yugoslavia," while the breakaway republics were
assigned their own codes.  "YU" was not changed to "CS" to reflect the
new political boundaries, but rather to reflect the name change.

>> The burden of persuasion is on you.
>
> "This part of ISO 3166 provides principles and maintenance
> arrangements of a code for the representation of country names removed
> from editions 1 to 4 of ISO 3166 and the consecutive edition of ISO
> 3166-1".
>
> Sorry, but you'll have to make your own decision.

ISO 3166-3 encodes the *state of transition* from one code to another
(such as YU to CS), or from one code to many (such as SU to one of 15
republics).  The state of transition is not what's interesting as far as
language tagging is concerned, but rather the region itself (old or
new).  RFC 3066bis allows all "formerly used" ISO 3166-1 codes, except,
for obvious reasons, those which have been reassigned.
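
To make the distinction concrete, here is a small Python sketch of going from an ISO 3166-3 alpha-4 code to a region subtag. The function names and the SAMPLE_REGISTERED set are hypothetical; the set holds only the three formerly used codes mentioned in this message and stands in for a lookup against the real registry.

```python
def former_region(alpha4: str) -> str:
    """Return the first half of an ISO 3166-3 code, i.e. the former alpha-2 code."""
    if len(alpha4) != 4 or not (alpha4.isascii() and alpha4.isalpha()):
        raise ValueError(f"not an alpha-4 code: {alpha4!r}")
    # The last two characters describe the transition and are ignored here.
    return alpha4[:2].upper()


# Assumption: a stand-in for the registry, limited to codes named in this thread.
SAMPLE_REGISTERED = {"YU", "SU", "DD"}


def tag_for_former_region(language: str, alpha4: str) -> str:
    """Build a language-region tag from the region half of an alpha-4 code."""
    region = former_region(alpha4)
    if region not in SAMPLE_REGISTERED:
        # Covers reassigned codes and anything else absent from the sample set.
        raise LookupError(f"region {region!r} is not in the sample registry")
    return f"{language.lower()}-{region}"


print(tag_for_former_region("cs", "YUCS"))
```

Only the region half ends up in the tag; the transition information carried by the second half is discarded.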

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/