[Fwd: Response to Mark's message]

Martin Duerst duerst at w3.org
Wed Apr 9 20:35:33 CEST 2003


At 18:02 03/04/09 -0500, Peter_Constable at sil.org wrote:

>Martin Duerst wrote on 04/09/2003 01:57:32 PM:
>
> > >However, the need for the addition of a script subtag to 3066bis is
>clear
> > >and present. And if 3066bis does not address that issue *very* soon,
> >
> > PLEASE!!! Stop complaining, start acting. Please submit the
> > necessary registrations for the 10 or 20 combinations that you
> > need, and follow through with these registrations.
>
>Or, alternately, is there anything that keeps one of us from beginning to
>author RFC3066bis?

I have explained why earlier, but I'm glad to repeat it (with a few
tweaks):

There are about 100 script codes. There are about
200 country/region codes, and about 500 (and increasing) language
codes. Creating 10,000,000 codes (roughly 100 x 200 x 500) for a
currently documented need of 12 or 25 codes seems like complete overkill.
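
To make the arithmetic explicit, here is a minimal sketch. The three counts are only the approximate figures quoted above, not exact registry sizes, and the function name is made up for illustration.

```python
# Approximate figures from the message above; not exact registry counts.
SCRIPT_CODES = 100
REGION_CODES = 200
LANGUAGE_CODES = 500


def combination_count(languages: int, scripts: int, regions: int) -> int:
    """Number of language-script-region tags if every slot is always filled."""
    return languages * scripts * regions


total = combination_count(LANGUAGE_CODES, SCRIPT_CODES, REGION_CODES)
print(f"{total:,}")  # prints 10,000,000
```

Against a documented need of 12 to 25 combinations, nearly all of those slots would never be used.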

One particular concern I have is that once there is a productive
pattern, the assumption that all the slots have to be filled in
seems to spread in an uncontrolled way. I have seen numerous examples
of tags such as 'ja-jp', which, as far as language goes, don't give
any more information than simply 'ja'. I have also seen
software that insisted on always having a country/region code
in a language tag. When I tested it, I would e.g. set the language
to 'he', and then look at the HTML generated and see 'he-us', because
the software was set with 'us' as the default, and I didn't change that.
(Needless to say, I didn't try out that software for more than
five minutes.)

We haven't created a tag for 'Yiddish written in Hebrew', and Michael
said that he would probably have rejected it. This is another good datapoint.

Another point is that while something like az-Latn/az-Cyrl is very
good for language negotiation (e.g. HTTP Accept-Language/
Content-Language headers), it is really enough to mark up the
actual text (e.g. with xml:lang) with 'az' only, because the
script is self-evident from the characters used.
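
As a rough illustration of "self-evident from the characters used", the sketch below guesses the script of a string from the Unicode character names of its letters. The helper name dominant_script is hypothetical, and taking the first word of each character name is a simplifying assumption that works for Latin and Cyrillic but is not a general script-detection method.

```python
import unicodedata
from collections import Counter
from typing import Optional


def dominant_script(text: str) -> Optional[str]:
    """Guess the script from the first word of each letter's Unicode name.

    Assumption: names such as 'LATIN SMALL LETTER A' or
    'CYRILLIC SMALL LETTER A' begin with the script name.
    Returns None if the text contains no named letters.
    """
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Both strings could be marked up as plain 'az'; the characters
# themselves show which script is in use.
print(dominant_script("Azərbaycan"))  # LATIN
print(dominant_script("Азәрбајҹан"))  # CYRILLIC
```

For content negotiation no text is available to inspect yet, which is why the script distinction matters there and not in the markup.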


Regards,   Martin.

