Many languages - how to code?

John Clews Scripts2@sesame.demon.co.uk
Sat, 04 May 2002 09:26:05 GMT


Keld Simonsen wrote, [re. (iso639.546) some proposals to ietf...]

> I am forwarding this message to the iso639 group as we could also
> consider these additions to ISO 639

Pavla A. Frazier had written:

> Dear IETF Language list serve group,
> 
> ISO version 1 and 2 combined include 36 language names for American
> Indian and Alaska Native languages, 9 of which are groupings.  These 9
> groups map to 62 individual names in a set of currently used American
> Indian and Alaska Native languages that I am proposing (see below)...
> There are a total of 159 individual language names that I want to
> submit for your consideration....

In passing, I also note that many of the names are listed in the
Ethnologue, published by the Summer Institute of Linguistics (SIL).
In many cases the 3-letter SIL codes for these have been in
well-established use for several decades, both by linguists and by
some organisations: Unesco is one that has cited their use from time
to time. Citation:

    Grimes, Barbara F., Pittman, Richard S. and Grimes, Joseph E.
    Ethnologue: Languages of the World. - 14th ed.
    SIL International, 2000.

Given Pavla A. Frazier's large number of proposals, in my view the
time is _long_ overdue for ISO/TC37/SC2/WG1 (Language Codes), the ISO
639 Joint Advisory Committee, the ISO/TC37/SC2 Language Codes Task
Force, the E-MELD project, SIL and IETF to really sit down and decide
on a proper strategy for extending the number of languages which can
be coded, and used in ICT systems, without compromising existing use.

Piecemeal or one-by-one code allocation is just _not_ going to meet
user needs, either in IETF or in ISO, at the rate at which the ISO 639
Joint Advisory Committee currently allocates codes.

So far from ISO we have some proposals, but in no great level of
detail; some codings, though far fewer than have been requested over
several years from various quarters; and several committees and
interest groups connected by email lists. In my more cynical moments
I'm tempted to suggest that there are almost more committees etc.
(see the above list) than there are language codes, certainly in
ISO 639, ISO 639-2 and registrations made by the ISO 639 Maintenance
Agency.

Most of the languages requested from ISO by various groups over the
years, for which there are still no ISO codes, _are_ listed by SIL in
the Ethnologue database, which is widely used.

Pavla A. Frazier's request (via Keld Simonsen) is not the first time
that needs for a large number of language codes have not been met by
ISO.

CEN/TC304 (Information and Communications Technologies: European
Localization Requirements) requested several tens of codes from the
ISO 639 Maintenance Agency several years ago, and the requests were
ignored.

Nor has the ISO 639 Joint Advisory Committee allocated codes for many
of the languages on that same list.

So far there is no mechanism, either in ISO or in IETF, to allow use
of the SIL codes as standard tags. If some mechanism had existed, the
reply to Pavla A. Frazier (and to CEN/TC304) would have been simple:
yes - here are the codes, and here's what you do, in relation to IETF
tags, or in relation to ISO codes.

Currently we face the possible scenario of three different schemas in
operation for the same languages: SIL codes, ISO codes, and IETF
tags. Hopefully it won't come to that, but unless and until a
sensible modus operandi is reached, one that does not ignore the
existence of the SIL codes, people are likely to be ill-served by
what is happening at present.

Next week, there is the W3C Internationalization Workshop in Dublin,
followed by the International Unicode Conference. I hope that
those reading this who are involved in Dublin may be able to make use
of this to discuss some of these issues, and hopefully to make some
suggestions that could help to improve the current state of things.

I look forward to any reactions, either this week, before the W3C
Internationalization Workshop in Dublin and the International
Unicode Conference, or during either of those meetings.

Enabling many more languages to be coded, so that they can be used
in ICT systems, is an increasingly urgent need.

Best regards

John Clews

--
John Clews,
Keytempo Limited (Information Management),
8 Avenue Rd, Harrogate, HG2 7PG
Email: Scripts2@sesame.demon.co.uk
tel: +44 1423 888 432;

Committee Member of ISO/IEC/JTC1/SC22/WG20: Internationalization;
Committee Member of ISO/TC37/SC2/WG1: Language Codes