ISO 639 and other language identifiers

John Cowan
Tue, 7 May 2002 17:53:01 -0400 (EDT)

Caoimhin O Donnaile scripsit:

> In such cases I suppose the database could be "conservative" and
> point straight from the language to "Tai-Kadai" as the parent node,
> until such time as there were better agreement and intermediate
> nodes could be added.

However, here again there is no reason, given a sample of text or sound
or video, to classify it as merely Tai-Kadai, nor is there any reasonable
human or computer process that can accept any and all Tai-Kadai material.
So the node, while unimpeachably correct, is not useful.

> However, English plus Scots is a label which people might very well
> want to use.  

[justification snipped]

Fair enough: this is a genuine use case for "English or Scots".

> According to the current (and previous) version of the Ethnologue,
> Frisian consists of three languages: Western Frisian, Northern
> Frisian and Eastern Frisian.  ISO 639-2 currently has only one code,
> "fry", 

The proposed SIL-ISO mapping treats only West Frisian as "fry", and
the others as "gem" (Germanic, other), which I think is the only
plausible mapping.

> What happens when the requests come in for
> separate language codes?  Does "fry" then become deprecated?  Or does
> it remain valid as a useful grouping?

In RFC 3066 the other two could be registered as "gem-frisian-east" and
"gem-frisian-north", or "gem-efrisian" and "gem-nfrisian", or in various
other ways.

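To illustrate why the choice of prefix matters: RFC 3066 (section 2.5)
says a language-range matches a tag if it equals the tag, or equals a
prefix of the tag that is immediately followed by "-". The Python sketch
below is only an illustration under assumptions: the tags are the
hypothetical registrations suggested above (none is actually registered),
and `matches` is a helper name invented for the example.

```python
def matches(language_range: str, tag: str) -> bool:
    """Return True if language_range matches tag, RFC 3066 section 2.5 style.

    A range matches when it equals the tag, or equals a prefix of the tag
    that is immediately followed by "-". The range "*" matches any tag.
    Comparison is case-insensitive.
    """
    r = language_range.lower()
    t = tag.lower()
    if r == "*":
        return True
    return t == r or t.startswith(r + "-")


# Hypothetical tags, taken from the suggestions in this message.
tags = ["fry", "gem-frisian-east", "gem-frisian-north", "gem-efrisian"]

# A search for "gem-frisian" picks up both hyphenated registrations,
# but not "fry" and not the one-subtag alternative "gem-efrisian".
frisian_hits = [t for t in tags if matches("gem-frisian", t)]

# A search for "gem" picks up every tag registered under that prefix.
germanic_hits = [t for t in tags if matches("gem", t)]
```

Under this rule the "gem-frisian-east" style gives a common "gem-frisian"
range for the two languages, whereas "gem-efrisian" and "gem-nfrisian"
share only the much broader "gem". Neither style makes "fry" match them.
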
> Even more dramatic is the case of Nedersaksisch, which has only one
> code, "nds", in ISO 639-2, but 13 language codes in the current
> Ethnologue.  It would be very tedious to have to specify all 13 codes
> in a search.

I'm confused.  ISO "nds" is precisely SIL's SAX, which has many
names.  Why do you think it is 13 languages?

John Cowan
I amar prestar aen, han mathon ne nen,
han mathon ne chae, a han noston ne 'wilith.  --Galadriel, _LOTR:FOTR_