New Last Call: 'Tags for Identifying Languages' to BCP

Peter Constable petercon at microsoft.com
Mon Dec 13 08:05:42 CET 2004


> From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-
> bounces at alvestrand.no] On Behalf Of Bruce Lilly

> > That misses the point entirely. The point is that IDs used by political
> > administrations may change for any number of reasons, and those
> > administrations may have no qualms with such changes;
> 
> For such changes to become enshrined in an ISO standard
> requires a bit more than a mere whim on the part of one
> party; in the case of the particular ISO standards under
> discussion, it requires convincing the duly appointed
> maintenance authority to make the change.

The ISO 3166 MA maintains that standard in accordance with the
identifiers specified by the UN Statistics Division; a change by the UN
is all the convincing that is required.



> > If for whatever reason ISO and the UN decided that "US" should
> > be used to designate the country of France, I doubt you'd expect every
> > software vendor to update all of their deployed installations to use
> > "fr-US" instead of "fr-FR", and for every user to go through every data
> > repository they manage to make such changes in their data.
> 
> The only way that would be likely to happen would be if
> there were no longer a "US" *and* if the ISO and UN
> representatives of France were to initiate a request for
> such a change.  One would presume that they would have
> good reason to do so, and could explain said reasons in
> order to convince their ISO and UN counterparts to agree
> to the change.  Under those hypothetical circumstances, I
> can only assume that software vendors who care about such
> matters would either agree with the hypothetical reasons
> or would have acted to convince those in favor of the change
> of reasons to avoid the change.

This scenario is not hypothetical; it actually occurred in the case of
CS. The change was solely under the control of the UN Statistics
Division; it is not part of their process to consult with developers and
users of IT systems in general, and no such consultation took place in
this case. Developers and users were completely powerless to influence
the change, learning about it only after the fact.

This is a situation we do not intend to repeat.


> And while I would not
> expect users to retroactively change documents any more than
> I would expect coins and paper money to be reissued with old
> dates but new designations of country name, I would expect
> that as of the agreed-upon effective date of the change that
> new documents would be prepared in accordance with the new
> standard.  It's difficult to be more precise about such a
> wild hypothetical, but consider similar changes made to
> time zones...

I have no interest in considering time zones; I will leave that to
people who solve problems related to time zones. Content cannot be
assumed to carry metadata indicating which version of a standard its
language tags follow, so versioning is not a general solution. There are
ways that we can impose stability, and that is what we seek and intend
to do.



> > The usability flaw in treating ISO 639 and ISO 3166 as human-readable is
> > evident in the confusion between ja and JP (or is it jp and JA?), and GB
> > vs UK.
> 
> Without looking I can easily tell that jp and uk are country
> codes precisely *because* they are well-known as TLDs.

It is not uncommon for users to confuse "JA" and "JP". 

"UK" is not an officially-assigned ISO 3166 country code; its status is
"exceptionally reserved". The alpha-2 ID listed in ISO 3166-1 for United
Kingdom is "GB".


> > As for what is silly, if the UN country ID for Canada changed to
> > CN (and that for PRC changed to something else), I'm sure it would cause
> > far greater problems for users to have to change the last two letters in
> > domain names than for them to keep doing what they always did.
> 
> And it is precisely because of such problems that it is
> as unlikely to happen as your hypothetical FR->US change.

Again, not hypothetical at all.


> > > > Neither RFC 1766 nor RFC 3066 has ever presented "official"
> > > > translations;
> > >
> > > Both defer to the ISO lists for definitions (not "translations")
> > > of the various codes.
> >
> > Definitions; not language names for display use.
> 
> Feh. Whatever. The human-readable stuff that corresponds
> to the code which you say shouldn't be read.   The stuff
> without which codes are meaningless.  The stuff without
> which two communicating parties cannot agree on the meaning
> of "XX".

The stuff that defines the meaning of "XX" where "XX" comes from an ISO
standard is the ISO standard. That was the case in RFC 3066. This draft
does not change that; it merely provides some info that may save you
having to go look up the ISO standard, but that info is not the last
word.


> So, you're saying that the ISO definition of "CS" as
> "Serbia and Montenegro" will continue to be valid, with
> that meaning, in a language-tag?

The meaning of an ID in the registry that came from an ISO standard is
the meaning it had in the version of that ISO standard from which it was
obtained. (Typically, that is the current version of the ISO standard at
the time the ID is added to the registry, though the initial registry
being prepared will have some exceptions to resolve pre-existing
ambiguous cases, such as CS.)

If you really want to know what the meaning of "CS" would be per
the proposed draft, the proposal is that it will forever remain valid
with the meaning "Czechoslovakia" as it was originally defined in ISO
3166.
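
As a rough illustration only, the stability rule described above amounts
to a registry whose entries are write-once. The sketch below is not taken
from the draft; the names Entry, StableRegistry, register and lookup are
made up for this example, and the "source" strings are informal labels
rather than real version identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    code: str      # the ID as registered, e.g. "CS"
    meaning: str   # denotation fixed at registration time
    source: str    # informal label for the standard version it came from


class StableRegistry:
    """Hypothetical write-once registry: a registered code keeps its meaning."""

    def __init__(self):
        self._entries = {}

    def register(self, code, meaning, source):
        key = code.upper()
        if key in self._entries:
            # A later change in the source standard does not alter the entry.
            return self._entries[key]
        entry = Entry(key, meaning, source)
        self._entries[key] = entry
        return entry

    def lookup(self, code):
        return self._entries[code.upper()]


registry = StableRegistry()
registry.register("CS", "Czechoslovakia", "ISO 3166 (original definition)")
# The later reassignment of "CS" in ISO 3166 is ignored by the registry.
registry.register("CS", "Serbia and Montenegro", "ISO 3166 (later revision)")
print(registry.lookup("cs").meaning)  # prints "Czechoslovakia"
```

The point of the write-once behaviour is that content tagged before a
change in the source standard keeps the meaning it had when it was
tagged, without needing any version information in the content itself.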


> The foolishness is your insistence on trying to tie
> the definitions to a localization issue.

It was you who established it as a localization issue, very clearly:

<quote>
> Surely, though, this is not a technical argument against the proposal.

Not purely technical, though it presents problems for existing
implementors who provide bilingual support.
Eliminating bilingual descriptions for the language, country (and UN
region) codes leaves implementors in a quandary.
</quote>



> I haven't specifically discussed "display names"; that is your
> assertion, and not my basis for objection.

You didn't use the term "display names", but it is clearly implied by
your reference to bilingual implementations.


> I refer to the
> definitions and the need to map to and from those definitions
> at either end of the communications channel.  Whether or not
> that happens by "display" is incidental to the issue of the
> number of languages that the definitions are provided in.

Definitions in multiple languages are not a requisite to establishing
the denotation of a coded element. There are widely-adopted coding
standards that establish denotations using one language only. In this
case, though, the denotations of ISO IDs are established by the ISO
standard (and particular version of that standard) from which they were
obtained. The registry contains a description that disambiguates which
ISO definition is to be used, but is not a replacement for the ISO
definition.


Peter Constable
Microsoft Corporation

