Scottish English

Doug Ewell dewell at adelphia.net
Sat Oct 22 20:55:19 CEST 2005


Keld Jørn Simonsen <keld at dkuug dot dk> wrote:

> You are probably right, but I think that ISO 3166-2 looks like the
> most promising spec to use for further qualifying a territory, eg to
> specify a dialect.

Mark and John are correct.  ISO 3166-2 tracks the internal subdivisions
of countries, *as those countries define them*, and its code elements
are therefore *much* more subject to arbitrary and capricious change
than those of ISO 639 or 3166-1 (or 15924).

It's true that for some countries, such as the United States, the
definitions and boundaries of states are quite stable, but this is not
uniformly true across the world, and in any case the U.S. has split and
merged various of its dependencies over the years, resulting in changes
to ISO 3166-2.

I used to favor the use of ISO 3166-2 for identifying language
variations.  Certainly there are the concepts of "California English"
and "Texas English" and "Scottish English," language variants commonly
identified with subnational regions that have code elements.  But this
works far less well for other regions, and the problem is worse than it
is for ISO 3166-1: who can say how "North Dakota English" differs from
"South Dakota English"?  The use of ISO 3166-1 is enshrined in language
tagging -- "we're stuck with it" -- but ISO 3166-2 is not.  Registering
variants one-by-one would be preferable, as John said.

> Could it be a way forward to try to remedy the problems of ISO 3166-2
> that Mark and John mention?

The instability of codes is a fact of life for internal subdivisions of
countries.  Ask Gwillim Law, who maintains a site all about them
(statoids.com) and has invented his own coding system for them; even he
has to change his codes to reflect real-life changes.

Besides that, there is the fact that ISO does not give away the 3166-2
code list for free (unlike 3166-1), resulting in an accessibility
problem, and the fact that 3166-2 code elements can be anywhere from 1
to 3 characters long, which means they may conflict with other elements
of RFC 3066bis language tags.
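
To make the length point concrete, here is a rough Python sketch.  It
assumes the subtag shapes as I read them in the 3066bis draft ABNF
(singleton = 1 alphanumeric, region = 2 letters or 3 digits, extlang =
3 letters, script = 4 letters, variant = 5 to 8 alphanumerics or a digit
plus 3 alphanumerics).  The function name possible_roles and the sample
codes (the subdivision parts of US-CA, GB-SCT and AT-1) are only
illustrative, and real parsing also depends on position, which this
ignores.

```python
import re

# Subtag shapes, assumed from the RFC 3066bis draft ABNF.
# Position within the tag is deliberately ignored in this sketch.
SHAPES = [
    ("singleton", r"[a-z0-9]"),
    ("extlang", r"[a-z]{3}"),
    ("script", r"[a-z]{4}"),
    ("region", r"[a-z]{2}|[0-9]{3}"),
    ("variant", r"[a-z0-9]{5,8}|[0-9][a-z0-9]{3}"),
]


def possible_roles(subtag):
    """Return the roles a non-initial subtag could be taken for, by shape alone."""
    s = subtag.lower()
    return [name for name, pattern in SHAPES if re.fullmatch(pattern, s)]


if __name__ == "__main__":
    # Subdivision parts of US-CA, GB-SCT and AT-1.
    for code in ("CA", "SCT", "1"):
        print(code, possible_roles(code))
```

Under those assumptions "CA" looks like a region, "SCT" like an extlang,
and "1" like a singleton, so a bare 3166-2 subdivision code dropped into
a tag would be read as something else.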

--
Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/



