possible additions

Fri, 3 May 2002 11:32:45 -0500

[copying to another relevant list with only partially overlapping
membership; apologies to those that receive duplicates]

On 05/03/2002 10:29:11 AM Michael Everson wrote:

>>Pavla has begun attempting something I have held back from doing for the
>>past two years: mass registrations. I suspected this was bound to happen
>>sooner or later.
>
>So did I.

[snip]

>>I see this mainly as a test of the willingness and ability of ISO to move
>>forward quickly with extension of the ISO 639 family of languages.
>
>Um, Peter, this isn't ISO, it's IETF.

It is? I must be lost. ("I knew I shoulda taken dat left toin at
Albuquerque." :-)

My comment may seem out of place on this list since this isn't ISO, yet the
Convener of ISO/TC 37/SC 2/WG 1 is present, as are some others who have
been involved in the work of WG 1. I guess my comments were primarily for
their benefit.

>Are you suggesting that we give
>this over to ISO or that we look at registering them here?

I don't have a simple answer for that. Allow me to explain.

Obviously Pavla can proceed to rework his requests into the necessary
format. I have held back from pursuing a large number of additional
registrations with IANA for the past 14 months after I started seeing
indications from ISO folk that there was interest in significantly
extending the ISO 639 standard. It seems to me that we don't need lang IDs
for a large number of languages in both places. If these get added to the
IANA registry now and then ISO comes up with an extension to ISO 639 in the
near future that provides codes for all of these as well, then the IANA
codes will eventually be deprecated. (There might be a time delay: if ISO
adds new codes in a new part of the ISO 639 family, e.g. an ISO 639-3, then
those would not be available without registration under RFC 3066, since
it does not identify ISO 639-3 as a source. But I'd be inclined to expect a
successor to RFC 3066 that did reference a new ISO 639-3 to appear before
long, and at that point codes we add now would get deprecated.) So, it
makes sense in the long term to look for codes for these languages to come
from ISO.

On the other hand, Pavla, quite understandably, doesn't want to wait a long
time to get what is needed by him and the agency he represents (Health
Level Seven). This is the crux of the issue that ISO folk need to consider.
They have been aware that industry needs for expanded coverage were
imminent, and that it behooves them to act expeditiously to avoid the
confusion that could result from duplication if others who can't keep on
waiting begin creating codes.

But it's an issue for this group as well: What do we want in the long term?
Do we want to keep the IANA registry limited to special cases of
individual-language definition or to codes for derivative notions (e.g.
orthographies, as in the recent German case), leaving the source for the
bulk of codes for individual languages to come from ISO 639? Or do we want
to disregard whatever future extensions to ISO 639 may (and are likely to)
appear, and in so doing invite a large number of new registrations? (If so,
we should start discussing process, because I'm going to want to start
generating a large number of requests.) Or do we want to try to permit
Pavla to pursue his requests yet hope that no other large requests appear
before ISO can come up with something more comprehensive?

Personally, I'm not certain which of these choices to recommend. On the one
hand, I think it makes most sense to keep the IANA registry limited and to
allow ISO 639 to be the source for the vast majority of codes for
individual languages. (I'm assuming a willingness on the part of ISO to
provide a comprehensive set of codes, and I have seen indications that that
is not an unrealistic expectation.) At the same time, user needs are real,
and I can easily sympathise with someone like Pavla not wanting to wait for
a large bureaucratic process to provide what is needed. I myself have been
very tempted to begin making mass requests for IANA registrations, but I've
held off because I have first needed to assess what would need to happen
for the languages of interest to me to be supported in major industry
protocols and platform infrastructures (e.g. MS Windows i18n
infrastructure).

I hope there is a real willingness of all the members here, and also of
major industry and standards-body stakeholders, to take a serious look at
where we want to go in the long term in relation to language identification
issues, and a willingness to start making rapid progress toward necessary
long-term solutions. I think it will be in the best interest of all of us
and of users at large if we do so.

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <peter_constable@sil.org>