Timetable for action: May 31 is suggested

Mark Davis mark.davis at jtcsv.com
Tue May 27 12:48:31 CEST 2003

1. This discussion has been going on not for 24 or 48 hours, but for
about 1,400 hours. Based on that earlier discussion, I posted specific
proposals for each of the 9 tags on Wed Apr 30 15:24:20 CEST 2003:


Despite the fact that it is well known that these written languages
exist, and are distinct, the reviewer wanted specific references.
(Note that http://www.ietf.org/rfc/rfc3066.txt does not explicitly
require that specific references be supplied.) I resubmitted them May
22, in:

(and following)

The first time, I did not hold the reviewer to the two-week limit that
the RFC requires of him, hoping more discussion would be productive.
If I thought that a few more weeks (on top of the many so far) would
let hold-outs review the issues carefully and come to reasoned
conclusions, that would be worth waiting for. But based on the
progress I see so far, I'm not very hopeful, so I'm planning to
prepare an appeal for the 5th of June.

2. RFC 3066 already makes many distinctions according to written
language, as evidenced by

You say wedging scripts into languages is "kind of a hack." Look at
Ken's message:

The entire basis for RFC 3066 is to be practical, and to be able to
recognize the practical distinctions among written forms that people
must make. Allowing the productive addition of a country code in any ID,
for example, allows for all of the codes on
despite the fact that the VAST majority of these are duplicates (in
the sense that there is no real distinction among the forms that they
refer to). However, IMPORTANTLY, this does not actually cause a
problem in practice. Even if 3066bis were extended to allow for script
codes generatively, the same thing would happen.

3. The vast majority of systems using RFC 3066 do, in fact, use it to
make distinctions among written language. And that is certainly the
important feature to industry and users. While on the margins, it
might be used for spoken language distinctions, the 99% case for RFC
3066 is written language. This point seems to be completely lost on
the people that object to the registrations.

It matters a lot to a great many users if a webpage is served up, and
some of the pieces on that page are in simplified Chinese and others
are in traditional Chinese. It matters a lot to Azeri users if some
pieces are Cyrillic Azeri, some in Arabic Azeri, and some pieces in
Latin Azeri. This matters a heck of a lot more than if part of a web
page is in de-1996 and part in de (pre 1996).
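
To make the point concrete, here is a minimal sketch (in Python) of how a
consumer could detect a page whose pieces mix scripts for one language. It
assumes tags of the form language-script, such as zh-Hans and zh-Hant, which
are used here purely as illustrative values; the helper names split_tag and
mixed_scripts are hypothetical and are not part of RFC 3066.

```python
# Illustrative sketch only: the tags are example values, and
# split_tag / mixed_scripts are hypothetical helper names.

def split_tag(tag):
    """Split a language tag into (language, script or None).

    Assumption: a 4-letter alphabetic second subtag denotes a script,
    e.g. "zh-Hans". Anything else (such as "1996") is not a script.
    """
    parts = tag.split("-")
    language = parts[0].lower()
    script = None
    if len(parts) > 1 and len(parts[1]) == 4 and parts[1].isalpha():
        script = parts[1].title()
    return language, script


def mixed_scripts(fragment_tags):
    """Return the languages that occur in more than one script."""
    seen = {}
    for tag in fragment_tags:
        language, script = split_tag(tag)
        seen.setdefault(language, set()).add(script)
    return {lang: scripts for lang, scripts in seen.items() if len(scripts) > 1}


# Tags attached to the pieces of one hypothetical web page.
page = ["zh-Hans", "zh-Hant", "az-Latn", "de-1996", "de"]
print(mixed_scripts(page))
```

Under these assumptions only the Chinese pieces are flagged. The de-1996 and
de pieces differ in orthography, not script, so this check does not separate
them.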

4. There is a clear and present need for the codes that have been
proposed. I understand that you have no sense of urgency, and that you
think that this can just meander along for a few years. (Although
based on what I can see in this group, it would more likely take
decades.) And remember, when you talk about "new frameworks", what you
are asking for is NOT just a new RFC for some different framework of
organizing text; you are asking for changes in all of the Web standards
and others that use RFC 3066. Not going to happen soon!

We, and the industry, and the customers, cannot afford that. These are
important distinctions in written language, distinctions such as
between simplified and traditional Chinese, which are FAR AND AWAY
MORE IMPORTANT TO INDUSTRY THAN de-1901, i-bnn, i-enochian, yi-latn,
and most of the other registrations on

People must be able to interoperate with systems, such as Windows,
that do (*correctly*) make these distinctions. (It is more than a bit
ironic that such closed systems are far more responsive to users'
needs here than this purportedly open system.) If RFC 3066 cannot make
these distinctions, then people will work around it: they will make up
their own mechanisms or use other IDs, such as those used by Windows,
instead. In practice, you will see a proliferation of private-use RFC
3066 codes or, worse yet, non-conformant RFC 3066 codes. Neither is
exactly wonderful for interoperability.


More information about the Ietf-languages mailing list