Language tags in IPP (was: Re: [Suppress-Script] Initial list of 300 languages)

John Cowan cowan at ccil.org
Mon Mar 13 01:57:59 CET 2006


Doug Ewell scripsit:

> It also seems to have some interrelationship with the character set of 
> the print job, which seems wrong to me; figuring out which character 
> repertoires are necessary for which natural languages is a decidedly 
> non-trivial effort (ask Michael, who has done this work for the European 
> languages).

Yes.  In particular, if the charset and the language don't agree, according
to the printer's notion of "agree", the printer is free to print mojibake.

> This strongly suggests to me that when we are considering adding 
> Suppress-Script values for up to 300 languages, we should focus 
> primarily on those languages that are most likely to be used with a 
> region subtag, and spend much less time worrying about the rest.

An excellent point.  Based on Ethnologue data on national and official
languages, I find that the following languages are national or official
in more than one country:

ar Arabic, bn Bengali, ch Chamorro, da Danish, de German, el Greek, en English,
es Spanish, fr French, hr Croatian, hu Hungarian, it Italian, ko Korean, ln Lingala,
ms Malay, nl Dutch, pt Portuguese, sd Sindhi, sr Serbian, ss Swati, sv Swedish,
sw Swahili, ta Tamil, tn Tswana, tr Turkish, ur Urdu, zh Mandarin Chinese.

Of these, all have Suppress-Script values except Korean (should be Hang),
Mandarin (should be Hani), Sindhi (multiple scripts: Arab or Guru),
and Serbian (Cyrl or Latn).
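
To make the four exceptions concrete, here is a minimal sketch in Python of how a Suppress-Script table might be used to drop a redundant script subtag. The table holds only the values proposed in this message, not the contents of any registry. The names PROPOSED_SUPPRESS_SCRIPT and strip_suppressed_script are hypothetical, and the parsing assumes a simple language-script-region tag with no extended language subtags.

```python
# Values proposed in this message only; this is not the IANA registry.
# None means "multiple scripts in use, so no Suppress-Script value".
PROPOSED_SUPPRESS_SCRIPT = {
    "ko": "Hang",
    "zh": "Hani",
    "sd": None,  # Arab or Guru
    "sr": None,  # Cyrl or Latn
}


def strip_suppressed_script(tag, table=PROPOSED_SUPPRESS_SCRIPT):
    """Drop the script subtag if it equals the language's Suppress-Script.

    Assumes the script, when present, is the second subtag and is four
    letters long; anything else is returned unchanged.
    """
    parts = tag.split("-")
    lang = parts[0].lower()
    if len(parts) > 1 and len(parts[1]) == 4 and parts[1].isalpha():
        suppressed = table.get(lang)
        if suppressed is not None and parts[1].title() == suppressed:
            del parts[1]
    return "-".join(parts)
```

Under these assumptions, strip_suppressed_script("ko-Hang-KR") gives "ko-KR", while "sr-Latn" is left alone because Serbian has no single default script.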

> Santali, which is spoken in multiple regions, and for which a "default" 
> script assignment is not obvious.

Fortunately, the answer for Santali is known:  "multiple scripts, the users
of which are at each other's throats."  No S-S value for Santali.

-- 
Take two turkeys, one goose, four               John Cowan
cabbages, but no duck, and mix them             http://www.ccil.org/~cowan
together. After one taste, you'll duck          cowan at ccil.org
soup the rest of your life.                     http://www.ap.org
        --Groucho

