LANGUAGE TAG REGISTRATION FORM

Fri Apr 11 13:09:23 CEST 2003

Michael,

> >Is there a distinction in orthography between each pair of the following?
>
> Your answer is "No", then.

Please slow down, and read what I wrote. My answer: "unknown". There is a
huge difference between "unknown" and "no". I have no idea whether Azeri
written in Iran has major/minor/subtle/nuanced orthographic differences
from Azeri written in Azerbaijan. And
neither do you. That's why RFC 3066 does not restrict the combinations of
639 codes and 3166 codes (see below). That decentralizes the choices, and
doesn't require some centralized authority to make a call as to whether,
say, en-US is different than *every* single one of the other codes in that
list.
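
To make the point concrete, here is a rough sketch in Python of what "does
not restrict the combinations" means in practice. The regular expression
follows the RFC 3066 syntax (a 1-8 letter primary subtag, then 1-8 character
alphanumeric subtags). The two small sets and the function name
is_language_country_tag are purely illustrative assumptions on my part; they
are tiny subsets, not the real ISO 639 and ISO 3166 lists.

```python
import re

# RFC 3066 syntax: a 1-8 letter primary subtag, then any number of
# 1-8 character alphanumeric subtags, separated by hyphens.
TAG_RE = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")

# Tiny illustrative subsets (assumption), NOT the full ISO 639 / ISO 3166 lists.
LANGUAGES = {"en", "az"}
COUNTRIES = {"US", "IR", "AZ", "AF", "ZW"}

def is_language_country_tag(tag):
    """True if tag is well formed and pairs a listed language with a listed country."""
    if not TAG_RE.fullmatch(tag):
        return False
    parts = tag.split("-")
    if len(parts) != 2:
        return False
    lang, country = parts
    return lang.lower() in LANGUAGES and country.upper() in COUNTRIES

# Every language/country pairing passes; no per-pair decision is consulted.
all_tags = [f"{lang}-{cc}" for lang in sorted(LANGUAGES) for cc in sorted(COUNTRIES)]
```

Nothing in the check asks whether az-IR and az-AZ differ orthographically;
the pairing alone is enough.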

> Because you are asking for a bunch of bogus busywork duplicate
> registrations in order to serve the problems some software has and I
> find that objectionable. We are NOT going to encode.
>
> >en-AF, en-AL, en-DZ, en-AS, en-AD, en-AO, en-AI, en-AQ, en-AG, en-AR,
> >en-AM,
...
> >en-VG, en-VI, en-WF, en-EH, en-YE, en-YU, en-ZM, en-ZW
>
> Because to do so would be idiotic. 3066 lets you add country codes in

Two problems evident in these statements:

1. The assumption that the above (elided list) are not encoded. That's
wrong; they are *already* valid RFC 3066 codes. You say "To do so is
idiotic"; well, RFC 3066 already has this entire list. Is that idiotic???

2. The assumption that software requirements should have no influence on RFC
3066 registrations. If that is the position of the IETF, Harald can let me
know right now, and I won't bother pursuing this issue.

> order to make LINGUISTIC distinctions, not to perform kludges for
...
> Bull. It takes time and effort on my part, on the part of IANA, and
> on the part of all the people who AREN'T Microsoft and whatever
> "others" you refer to to weed through the mass of duplicate
> registrations in order to determine what the unique entities are.

"unique entities"???

Speaking of bull, those cows and horses are *LONG* out of the barn. That is
the point that the long list makes. Every item in that long list is a valid
RFC 3066 code, but nobody knows whether there are "real" orthographic
differences between all of them or not. (And what are the solid criteria for
such a judgment anyway? Scholars might see a difference where the layman
doesn't.)

> Or you could alter your software and make it work properly with
> language codes and locales.

Sadly, it is not our choice. And, sadly, I have no magic wand to make
Microsoft realize that no matter what its customers say, there is no need to
make these distinctions. In examples like the one I gave earlier on this list
(the "Theatre Centre..." example), these are NOT locale distinctions, these are
language distinctions, by any practical measure. (I do not confuse these:
differing in timezone does not, for example, qualify for a different
language ID, although it may qualify for a different locale ID.) If you want
to ignore real distinctions that people make out in the world because of
some ivory-tower notions of purism, then as I say, I hope that Harald lets
me know soon, so I can do something sensible with my time instead of
continuing this.

Mark