Language tags in IPP (was: Re: [Suppress-Script] Initial list of 300 languages)
John Cowan
cowan at ccil.org
Mon Mar 13 01:57:59 CET 2006
Doug Ewell scripsit:
> It also seems to have some interrelationship with the character set of
> the print job, which seems wrong to me; figuring out which character
> repertoires are necessary for which natural languages is a decidedly
> non-trivial effort (ask Michael, who has done this work for the European
> languages).
Yes. In particular, if the charset and the language don't agree, according
to the printer's notion of "agree", the printer is free to print mojibake.
> This strongly suggests to me that when we are considering adding
> Suppress-Script values for up to 300 languages, we should focus
> primarily on those languages that are most likely to be used with a
> region subtag, and spend much less time worrying about the rest.
An excellent point. Based on Ethnologue data on national and official
languages, I find that the following languages are national or official
in more than one country:
ar Arabic, bn Bengali, ch Chamorro, da Danish, de German, el Greek, en English,
es Spanish, fr French, hr Croatian, hu Hungarian, it Italian, ko Korean, ln Lingala,
ms Malay, nl Dutch, pt Portuguese, sd Sindhi, sr Serbian, ss Swati, sv Swedish,
sw Swahili, ta Tamil, tn Tswana, tr Turkish, ur Urdu, zh Mandarin Chinese.
Of these, all have Suppress-Script values except Korean (should be Hang),
Mandarin (should be Hani), Sindhi (multiple scripts: Arab or Guru),
and Serbian (Cyrl or Latn).
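
To make the effect of these values concrete, here is a minimal Python sketch of how a tag generator might apply them. The table holds only the values suggested in the paragraph above, not the contents of any registry, and the names `PROPOSED_SUPPRESS_SCRIPT`, `MULTI_SCRIPT`, `has_default_script`, and `minimal_tag` are hypothetical, introduced purely for illustration.

```python
# Illustrative only: values come from the paragraph above, not from the registry.
PROPOSED_SUPPRESS_SCRIPT = {
    "ko": "Hang",  # Korean
    "zh": "Hani",  # Mandarin Chinese
}

# Languages named above as written in more than one script,
# so no single script can be suppressed.
MULTI_SCRIPT = {
    "sd": ("Arab", "Guru"),  # Sindhi
    "sr": ("Cyrl", "Latn"),  # Serbian
}


def has_default_script(language):
    """Return True if this sketch knows a single default script for the language."""
    return language.lower() in PROPOSED_SUPPRESS_SCRIPT


def minimal_tag(language, script=None, region=None):
    """Build a language tag, omitting the script subtag when it is the default."""
    language = language.lower()
    parts = [language]
    if script:
        script = script.title()  # script subtags are conventionally title case
        if PROPOSED_SUPPRESS_SCRIPT.get(language) != script:
            parts.append(script)
    if region:
        parts.append(region.upper())  # region subtags are conventionally upper case
    return "-".join(parts)


if __name__ == "__main__":
    print(minimal_tag("ko", "Hang", "KR"))  # script is the default, so it is dropped
    print(minimal_tag("sd", "Arab", "PK"))  # no single default, so the script stays
```

Under these assumptions, a multi-script language such as Sindhi or Serbian always keeps its script subtag, which is the practical consequence of assigning it no S-S value.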
> Santali, which is spoken in multiple regions, and for which a "default"
> script assignment is not obvious.
Fortunately, the answer for Santali is known: "multiple scripts, the users
of which are at each other's throats." No S-S value for Santali.
-- 
Take two turkeys, one goose, four               John Cowan
cabbages, but no duck, and mix them             http://www.ccil.org/~cowan
together. After one taste, you'll duck          cowan at ccil.org
soup the rest of your life.                     http://www.ap.org
        --Groucho