What's the plan for ISO 639-3 and RFC 3066 ter?
petercon at microsoft.com
Fri Aug 20 07:48:58 CEST 2004
[comments on the first half]
> From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-
> bounces at alvestrand.no] On Behalf Of John Cowan
> The situation as I understand it will be as follows:
Very good overview. Some minor comments:
> 1) The ISO 639 standard will draw on a single pool of three-letter
> codes. No three-letter code will be used for more than one purpose.
> 2) ISO 639-3 will assign codes to individual languages and
> macro-languages. These codes will be identical to the existing 639-2
> codes where they exist; where there is no existing 639-2 code, they
> will be identical to Ethnologue 14th edition codes where possible.
Ethnologue 14th edition was used for initial work, but now that the 15th
edition is in its last stages of production, it is the primary source.
(BTW, Ethn 15 will use the codes in the draft for 639-3, and the plan is
that in the future it will follow ISO 639.) There were cases in which
codes were changed from what Ethnologue 14 had but this change wasn't
strictly necessary. (E.g. on the first CD ballot there were a few
national body comments regarding specific cases requesting that IDs not
use sequences that were suggestive of derogatory names or that IDs for
closely-related languages be more similar.) For the vast majority,
though, Ethn 14 codes were used.
> 3) ISO 639-5 will assign codes to language collections. These codes
> will be identical to the existing 639-2 codes where they exist.
> 4) ISO 639-3 and ISO 639-5 codes will be disjoint.
> 5) ISO 639-2 will specify a subset of (the union of ISO 639-3 and ISO
> 639-5 codes) that specify languages which meet the restrictions of ISO 639-2
> (basically, that there are at least fifty documents in the language,
> held by at most five organizations).
> 6) ISO 639-1 will continue to specify a subset of ISO 639-2, and will
> assign two-letter codes to its members. Except for a transitional
> period after the promulgation of ISO 639-3 and ISO 639-5, it will effectively
> become a closed collection.
(If it reaches 676 items, it will necessarily become a closed set!) Its
future is not entirely clear at this time; I would not have made this
> (ISO 639-4 will explain all this, and will not define any codes.)
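The subset and disjointness relationships in points 2 through 6 can be sketched with ordinary sets. This is only an illustration: the handful of codes below are sample values picked for the example, not an extract from any of the standards or drafts, and the names `part1_to_part2` and `check_invariants` are hypothetical.

```python
# Illustrative sample only: a few codes per part, not real code tables.
part3 = {"eng", "spa", "zho", "cmn", "nan"}   # individual languages and macro-languages
part5 = {"gem", "roa"}                        # language collections
part2 = {"eng", "spa", "zho", "gem", "roa"}   # subset of the union of part3 and part5
part1_to_part2 = {"en": "eng", "es": "spa", "zh": "zho"}  # two-letter codes for members of part2


def check_invariants(part1_to_part2, part2, part3, part5):
    """Return a list of violated relationships (empty if all of them hold)."""
    problems = []
    if part3 & part5:
        problems.append("639-3 and 639-5 overlap")
    if not part2 <= (part3 | part5):
        problems.append("639-2 is not a subset of 639-3 union 639-5")
    if not set(part1_to_part2.values()) <= part2:
        problems.append("639-1 maps to a code outside 639-2")
    return problems


print(check_invariants(part1_to_part2, part2, part3, part5))
```

An empty list means the sample data respects all three relationships; adding "eng" to `part5`, for instance, would trip the disjointness check.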
> > The need for extlang subtags would then be muted (and might even be
> > eliminated). Only language codes that had "macro languages"
> > associated with them could be registered as extlangs. In fact, these subtags
> > might be cherry picked on an as-needed basis (rather than having a
> > full-fledged formal source).
> ISO 639-3 will provide a mapping between macro-languages and the
> individual languages that are parts of them.
> I don't know (and it may
> not have been decided) whether ISO 639-5 will provide a mapping
> between collective codes and the languages covered by them.
I have suggested (perhaps only indirectly) that such mappings be
included in 639-5. This is still TBD, though.
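A macro-language mapping of the kind discussed here could be represented as a simple lookup table. The sketch below is an assumption about shape only, not the format of any draft: the member lists are partial, illustrative samples, and `MACRO_MEMBERS` and `macro_of` are hypothetical names.

```python
# Hypothetical, partial sample of a macro-language -> individual-language mapping.
MACRO_MEMBERS = {
    "zho": ["cmn", "yue", "nan", "hak"],
    "ara": ["arb", "arz"],
}


def macro_of(code):
    """Return the macro-language containing `code`, or None if there is none."""
    for macro, members in MACRO_MEMBERS.items():
        if code in members:
            return macro
    return None


print(macro_of("nan"))
print(macro_of("eng"))
```

A flat table like this assumes no macro-language contains another macro-language; nesting would need a recursive lookup instead.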
> This (and its twin zh-min-bei) are the most complex cases. The vast
> majority of all macro-languages do not contain other macro-languages
> (as zh contains min).
Note that the draft for 639-3 does not include anything corresponding to
Min: since "min" couldn't be used for Min, there was no motivation from
the existence of "zh-min-nan", and there were potential concerns related
to having dolls-within-dolls (so to speak) that had not yet been
explored. I decided it was safer to allow time for such issues to be
explored first, and then to add macrolanguages such as Min once it was
agreed how any concerns could be mitigated.
> Indeed, it is doubtful whether ISO 639-3
> will provide nested macro-languages at all.
There certainly is doubt. My hunch is that, through long-term use and
maintenance, we'll eventually end up with at least some nested
macrolanguages, but nothing is fixed in stone at this time (and I have no
plan to specify any sanction or restrictions related to this in the text