Registration of el-Latn language tag

Tex Texin tex at xencraft.com
Thu Sep 29 11:39:40 CEST 2005


Mark,

I acknowledged in my original note that precision was bad for "en" as well.
I understand the tradeoffs of a generative mechanism vs a registry.

I don't understand why we are doing both.

I am quite simply trying to understand what it is we are doing on this
list now.
A few years ago this was a registry, and there was discussion of what was
and was not considered a language.
There seemed to be criteria, and data was cited to confirm or disconfirm
candidates.

Then we proposed a more sophisticated naming pattern and the emphasis now
seems to be whether the name fits the pattern and whether the name might be
useful, but the tie to references and language seems very removed.

And if we are to work with a generative naming scheme, I don't see why we
need to register the possibilities.
On the other hand, if we have a registry, I don't see that the naming scheme
is all that important.
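
To make the contrast concrete, here is a minimal sketch in Python of the two
approaches. Everything in it is an assumption for illustration: the tiny
LANGUAGES and REGIONS tables stand in for the ISO 639 and ISO 3166 code
lists, REGISTRY stands in for an explicit list of registered tags, and the
function names are hypothetical. It handles only two-part language-region
tags and ignores case folding and script subtags.

```python
# Hypothetical stand-ins for the ISO 639 language and ISO 3166 region lists.
LANGUAGES = {"en": "English", "el": "Greek", "haw": "Hawaiian"}
REGIONS = {"DE": "Germany", "GB": "United Kingdom", "US": "United States"}

# Hypothetical registry: only tags that someone explicitly registered.
REGISTRY = {"en-GB", "en-US"}


def is_generatively_valid(tag):
    """Accept any language-region pair composed of known subtags."""
    parts = tag.split("-")
    if len(parts) != 2:
        return False
    language, region = parts
    return language in LANGUAGES and region in REGIONS


def is_registered(tag):
    """Accept only tags that appear in the explicit registry."""
    return tag in REGISTRY


def describe(tag):
    """Build a display name from the subtags, e.g. for a generated tag."""
    language, region = tag.split("-")
    return f"{LANGUAGES[language]}({REGIONS[region]})"


if __name__ == "__main__":
    for tag in ("en-GB", "haw-DE"):
        print(tag, describe(tag), is_generatively_valid(tag), is_registered(tag))
```

Under these assumptions haw-DE passes the generative check without ever being
registered, which is the redundancy I am asking about: once composition alone
decides validity, the registry entry adds nothing to the check itself.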

(More so, as the matching rules have become more complex, and we now delight
in moving to new names that match the scheme better, but we still support
the old names, so heck we are just creating more work.)

So I am just trying to understand what the nature of the discussion is to be
when we decide whether or not to register a tag.
What is a possible argument for not encoding any language-latn? Especially,
if you accept any of a number of transliteration schemes.

Once we answer that, I hope to be able to know what I am supposed to do with
a registered tag.
Right now if I follow your guideline, I can't know anything about the
language, and I have to wonder why I should convince others that it is a
good idea to include a language tag in their web page and other metadata.

Personally, I disagree with your en-GB example. From a linguistic
standpoint, maybe it is English from the UK.
But most people would be quite happy to have their en-GB spell checker
reject most of it.
And I never said that if 5 examples were offered, that they became the sole
definition of the language. It is understood that they were representative
and not all-encompassing.

If language tags are to be used on the Web, and supported by office tools,
and to be recognizable by most users (given suitable expanded names and not
subtags), they should have meanings that typical users can relate to.
As I have said many times, I understand the need of linguists, and we have
SIL and more detailed standards with many more entries for their use.

But we should have a clear set (or subset) of tags that most users can work
with and get what they expect, and where it is unclear, we should be able to
give them a definition. And there should be a reasonable precision to the
definition.
I should be able to examine some text and determine its tag. Looking at your
example could you determine it was en-GB and not something else, perhaps not
even in the English family?

We are no longer defining anything that is of use to typical users, and if
we want en-GB to include the example you offered, then we have also
undermined what users and most of the software industry
understood the tag to mean.

Now we might agree that the tag always included such text and it was
tolerated that the example was not supported by spell checkers or text to
voice devices.
But as we are expanding the number of tags, we need to guide the industry as
to their meaning and usage.
And anyone following these discussions will likely be very confused as to
the meaning of the tags.
That does not serve our industry. Or at least not the industry I thought the
IETF was about.

So what are the criteria for registering a language or prohibiting a
registration?

tex



Mark Davis wrote:
> 
> a) We have never had the accuracy that Tex is looking for. "en-US" does
> not tell me exactly what to expect; it ranges from the English used in
> "Hee Haw", to rap, to Robert De Niro's English in "Raging Bull". What is
> the practical problem that having el-Latn causes? The lack of
> spell-checkers doesn't mean it's an illegitimate tag; I don't have a
> spell-checker for the following, but it is valid en-GB:
> 
> Beatrice: Not til God make men of some other mettal then
> earth, would it not grieue a woman to be ouer-masterd with
> a peece of valiant dust? to make an account of her life to a clod
> of waiward marle? no vnckle, ile none: Adams sonnes are my
> brethren, and truely I holde it a sinne to match in my kin-
> red. [Much Ado About Nothing (Quarto) 2.1]
> 
> b). We already have a generative mechanism in 3066 primus. All of the
> tags at the end of this message work. Is this a problem? Nope.
> Generative mechanisms have a huge advantage over registrations. If I
> have an application or a customer that needs haw-DE, I can use that tag
> right now, without having to wait months or years for it to be registered.
> 
> The meaning of the tag is clear, even though the precise denotation is
> not -- but you can *never* get precision anyway. Having a registration
> citing 5 books doesn't mean that the text is *limited* to precisely what
> is described in those 5 books (all and only the words in those books,
> all and only the grammatical constructs and combinations of words in
> those 5 books)  -- such registrations would be useless.
> 
> ========
> 
> haw-AD: Hawaiian(Andorra), haw-AE: Hawaiian(United Arab Emirates),
> haw-AF: Hawaiian(Afghanistan), haw-AG: Hawaiian(Antigua and Barbuda),
> haw-AI: Hawaiian(Anguilla), haw-AL: Hawaiian(Albania), haw-AM:
> Hawaiian(Armenia), haw-AN: Hawaiian(Netherlands Antilles), haw-AO:
> Hawaiian(Angola), haw-AQ: Hawaiian(Antarctica), haw-AR:
> Hawaiian(Argentina), haw-AS: Hawaiian(American Samoa), haw-AT:
> Hawaiian(Austria), haw-AU: Hawaiian(Australia), haw-AW: Hawaiian(Aruba),
> haw-AX: Hawaiian(Aland Islands), haw-AZ: Hawaiian(Azerbaijan), haw-BA:
> Hawaiian(Bosnia and Herzegovina), haw-BB: Hawaiian(Barbados), haw-BD:
> Hawaiian(Bangladesh), haw-BE: Hawaiian(Belgium), haw-BF:
> Hawaiian(Burkina Faso), haw-BG: Hawaiian(Bulgaria), haw-BH:
> Hawaiian(Bahrain), haw-BI: Hawaiian(Burundi), haw-BJ: Hawaiian(Benin),
> haw-BM: Hawaiian(Bermuda), haw-BN: Hawaiian(Brunei), haw-BO:
> Hawaiian(Bolivia), haw-BQ: Hawaiian(British Antarctic Territory),
> haw-BR: Hawaiian(Brazil), haw-BS: Hawaiian(Bahamas), haw-BT:
> Hawaiian(Bhutan), haw-BV: Hawaiian(Bouvet Island), haw-BW:
> Hawaiian(Botswana), haw-BY: Hawaiian(Belarus), haw-BZ: Hawaiian(Belize),
> haw-CA: Hawaiian(Canada), haw-CC: Hawaiian(Cocos (Keeling) Islands),
> haw-CD: Hawaiian(Congo (Kinshasa)), haw-CF: Hawaiian(Central African
> Republic), haw-CG: Hawaiian(Congo (Brazzaville)), haw-CH:
> Hawaiian(Switzerland), haw-CI: Hawaiian(Ivory Coast), haw-CK:
> Hawaiian(Cook Islands), haw-CL: Hawaiian(Chile), haw-CM:
> Hawaiian(Cameroon), haw-CN: Hawaiian(China), haw-CO: Hawaiian(Colombia),
> haw-CR: Hawaiian(Costa Rica), haw-CS: Hawaiian(Serbia And Montenegro),
> haw-CT: Hawaiian(Canton and Enderbury Islands), haw-CU: Hawaiian(Cuba),
> haw-CV: Hawaiian(Cape Verde), haw-CX: Hawaiian(Christmas Island),
> haw-CY: Hawaiian(Cyprus), haw-CZ: Hawaiian(Czech Republic), haw-DD:
> Hawaiian(East Germany), haw-DE: Hawaiian(Germany), haw-DJ:
> Hawaiian(Djibouti), haw-DK: Hawaiian(Denmark), haw-DM:
> Hawaiian(Dominica), haw-DO: Hawaiian(Dominican Republic), haw-DZ:
> Hawaiian(Algeria), haw-EC: Hawaiian(Ecuador), haw-EE: Hawaiian(Estonia),
> haw-EG: Hawaiian(Egypt), haw-EH: Hawaiian(Western Sahara), haw-ER:
> Hawaiian(Eritrea), haw-ES: Hawaiian(Spain), haw-ET: Hawaiian(Ethiopia),
> haw-FI: Hawaiian(Finland), haw-FJ: Hawaiian(Fiji), haw-FK:
> Hawaiian(Falkland Islands), haw-FM: Hawaiian(Micronesia), haw-FO:
> Hawaiian(Faroe Islands), haw-FQ: Hawaiian(French Southern and Antarctic
> Territories), haw-FR: Hawaiian(France), haw-FX: Hawaiian(Metropolitan
> France), haw-GA: Hawaiian(Gabon), haw-GB: Hawaiian(United Kingdom),
> haw-GD: Hawaiian(Grenada), haw-GE: Hawaiian(Georgia), haw-GF:
> Hawaiian(French Guiana), haw-GH: Hawaiian(Ghana), haw-GI:
> Hawaiian(Gibraltar), haw-GL: Hawaiian(Greenland), haw-GM:
> Hawaiian(Gambia), haw-GN: Hawaiian(Guinea), haw-GP:
> Hawaiian(Guadeloupe), haw-GQ: Hawaiian(Equatorial Guinea), haw-GR:
> Hawaiian(Greece), haw-GS: Hawaiian(South Georgia and the South Sandwich
> Islands), haw-GT: Hawaiian(Guatemala), haw-GU: Hawaiian(Guam), haw-GW:
> Hawaiian(Guinea-Bissau), haw-GY: Hawaiian(Guyana), haw-HK: Hawaiian(Hong
> Kong S.A.R., China), haw-HM: Hawaiian(Heard Island and McDonald
> Islands), haw-HN: Hawaiian(Honduras), haw-HR: Hawaiian(Croatia), haw-HT:
> Hawaiian(Haiti), haw-HU: Hawaiian(Hungary), haw-ID: Hawaiian(Indonesia),
> haw-IE: Hawaiian(Ireland), haw-IL: Hawaiian(Israel), haw-IN:
> Hawaiian(India), haw-IO: Hawaiian(British Indian Ocean Territory),
> haw-IQ: Hawaiian(Iraq), haw-IR: Hawaiian(Iran), haw-IS:
> Hawaiian(Iceland), haw-IT: Hawaiian(Italy), haw-JM: Hawaiian(Jamaica),
> haw-JO: Hawaiian(Jordan), haw-JP: Hawaiian(Japan), haw-JT:
> Hawaiian(Johnston Island), haw-KE: Hawaiian(Kenya), haw-KG:
> Hawaiian(Kyrgyzstan), haw-KH: Hawaiian(Cambodia), haw-KI:
> Hawaiian(Kiribati), haw-KM: Hawaiian(Comoros), haw-KN: Hawaiian(Saint
> Kitts and Nevis), haw-KP: Hawaiian(North Korea), haw-KR: Hawaiian(South
> Korea), haw-KW: Hawaiian(Kuwait), haw-KY: Hawaiian(Cayman Islands),
> haw-KZ: Hawaiian(Kazakhstan), haw-LA: Hawaiian(Laos), haw-LB:
> Hawaiian(Lebanon), haw-LC: Hawaiian(Saint Lucia), haw-LI:
> Hawaiian(Liechtenstein), haw-LK: Hawaiian(Sri Lanka), haw-LR:
> Hawaiian(Liberia), haw-LS: Hawaiian(Lesotho), haw-LT:
> Hawaiian(Lithuania), haw-LU: Hawaiian(Luxembourg), haw-LV:
> Hawaiian(Latvia), haw-LY: Hawaiian(Libya), haw-MA: Hawaiian(Morocco),
> haw-MC: Hawaiian(Monaco), haw-MD: Hawaiian(Moldova), haw-MG:
> Hawaiian(Madagascar), haw-MH: Hawaiian(Marshall Islands), haw-MI:
> Hawaiian(Midway Islands), haw-MK: Hawaiian(Macedonia), haw-ML:
> Hawaiian(Mali), haw-MM: Hawaiian(Myanmar), haw-MN: Hawaiian(Mongolia),
> haw-MO: Hawaiian(Macao S.A.R., China), haw-MP: Hawaiian(Northern Mariana
> Islands), haw-MQ: Hawaiian(Martinique), haw-MR: Hawaiian(Mauritania),
> haw-MS: Hawaiian(Montserrat), haw-MT: Hawaiian(Malta), haw-MU:
> Hawaiian(Mauritius), haw-MV: Hawaiian(Maldives), haw-MW:
> Hawaiian(Malawi), haw-MX: Hawaiian(Mexico), haw-MY: Hawaiian(Malaysia),
> haw-MZ: Hawaiian(Mozambique), haw-NA: Hawaiian(Namibia), haw-NC:
> Hawaiian(New Caledonia), haw-NE: Hawaiian(Niger), haw-NF:
> Hawaiian(Norfolk Island), haw-NG: Hawaiian(Nigeria), haw-NI:
> Hawaiian(Nicaragua), haw-NL: Hawaiian(Netherlands), haw-NO:
> Hawaiian(Norway), haw-NP: Hawaiian(Nepal), haw-NQ: Hawaiian(Dronning
> Maud Land), haw-NR: Hawaiian(Nauru), haw-NT: Hawaiian(Neutral Zone),
> haw-NU: Hawaiian(Niue), haw-NZ: Hawaiian(New Zealand), haw-OM:
> Hawaiian(Oman), haw-PA: Hawaiian(Panama), haw-PC: Hawaiian(Pacific
> Islands Trust Territory), haw-PE: Hawaiian(Peru), haw-PF:
> Hawaiian(French Polynesia), haw-PG: Hawaiian(Papua New Guinea), haw-PH:
> Hawaiian(Philippines), haw-PK: Hawaiian(Pakistan), haw-PL:
> Hawaiian(Poland), haw-PM: Hawaiian(Saint Pierre and Miquelon), haw-PN:
> Hawaiian(Pitcairn), haw-PR: Hawaiian(Puerto Rico), haw-PS:
> Hawaiian(Palestinian Territory), haw-PT: Hawaiian(Portugal), haw-PU:
> Hawaiian(U.S. Miscellaneous Pacific Islands), haw-PW: Hawaiian(Palau),
> haw-PY: Hawaiian(Paraguay), haw-PZ: Hawaiian(Panama Canal Zone), haw-QA:
> Hawaiian(Qatar), haw-QO: Hawaiian(Outlying Oceania), haw-RE:
> Hawaiian(Reunion), haw-RO: Hawaiian(Romania), haw-RU: Hawaiian(Russia),
> haw-RW: Hawaiian(Rwanda), haw-SA: Hawaiian(Saudi Arabia), haw-SB:
> Hawaiian(Solomon Islands), haw-SC: Hawaiian(Seychelles), haw-SD:
> Hawaiian(Sudan), haw-SE: Hawaiian(Sweden), haw-SG: Hawaiian(Singapore),
> haw-SH: Hawaiian(Saint Helena), haw-SI: Hawaiian(Slovenia), haw-SJ:
> Hawaiian(Svalbard and Jan Mayen), haw-SK: Hawaiian(Slovakia), haw-SL:
> Hawaiian(Sierra Leone), haw-SM: Hawaiian(San Marino), haw-SN:
> Hawaiian(Senegal), haw-SO: Hawaiian(Somalia), haw-SR:
> Hawaiian(Suriname), haw-ST: Hawaiian(Sao Tome and Principe), haw-SU:
> Hawaiian(Union of Soviet Socialist Republics), haw-SV: Hawaiian(El
> Salvador), haw-SY: Hawaiian(Syria), haw-SZ: Hawaiian(Swaziland), haw-TC:
> Hawaiian(Turks and Caicos Islands), haw-TD: Hawaiian(Chad), haw-TF:
> Hawaiian(French Southern Territories), haw-TG: Hawaiian(Togo), haw-TH:
> Hawaiian(Thailand), haw-TJ: Hawaiian(Tajikistan), haw-TK:
> Hawaiian(Tokelau), haw-TL: Hawaiian(East Timor), haw-TM:
> Hawaiian(Turkmenistan), haw-TN: Hawaiian(Tunisia), haw-TO:
> Hawaiian(Tonga), haw-TR: Hawaiian(Turkey), haw-TT: Hawaiian(Trinidad and
> Tobago), haw-TV: Hawaiian(Tuvalu), haw-TW: Hawaiian(Taiwan), haw-TZ:
> Hawaiian(Tanzania), haw-UA: Hawaiian(Ukraine), haw-UG: Hawaiian(Uganda),
> haw-UM: Hawaiian(United States Minor Outlying Islands), haw-US:
> Hawaiian(United States), haw-UY: Hawaiian(Uruguay), haw-UZ:
> Hawaiian(Uzbekistan), haw-VA: Hawaiian(Vatican), haw-VC: Hawaiian(Saint
> Vincent and the Grenadines), haw-VD: Hawaiian(North Vietnam), haw-VE:
> Hawaiian(Venezuela), haw-VG: Hawaiian(British Virgin Islands), haw-VI:
> Hawaiian(U.S. Virgin Islands), haw-VN: Hawaiian(Vietnam), haw-VU:
> Hawaiian(Vanuatu), haw-WF: Hawaiian(Wallis and Futuna), haw-WK:
> Hawaiian(Wake Island), haw-WS: Hawaiian(Samoa), haw-YD:
> Hawaiian(People's Democratic Republic of Yemen), haw-YE:
> Hawaiian(Yemen), haw-YT: Hawaiian(Mayotte), haw-ZA: Hawaiian(South
> Africa), haw-ZM: Hawaiian(Zambia), haw-ZW: Hawaiian(Zimbabwe)
> 
> Harald Tveit Alvestrand wrote:
> 
> >
> >
> > --On onsdag, september 28, 2005 04:19:18 -0700 Tex Texin
> > <tex at xencraft.com> wrote:
> >
> >> It's a good reason not to Register generative tags.
> >>
> >> So when someone requests a tag now, what is the reviewer to look at?
> >>
> >> We used to identify a few representative books, which I always thought
> >> meant we were identifying a particular set of rules around the language
> >> (spelling, orthography).
> >>
> >> The registration for el-Latn more or less stipulates the need for
> >> transliteration, mentions that they exist, with  a link to a site that
> >> collects transliteration systems. (Which btw, I think is a really bad
> >> idea
> >> in the event the site goes away or completely changes its list of
> >> reference materials.) But it doesn't really nail down what it is. (It
> >> mentions a standard, but doesn't say the tag is referring to that
> >> particular standard.)
> >>
> >> So we are no longer identifying a reference or a particular language,
> >> but
> >> just the concept that there seems to be something like a language of
> >> this
> >> persuasion. I guess we were asking for this with es-419. (Which I was
> >> also a proponent of.)
> >
> >
> > Yep.
> >
> > For this particular request, one reason why I don't care much what the
> > registration says is that we have a community consensus that anyone
> > using a tag consisting of a 639 language and a script doesn't have to
> > register it - he can just use it if he thinks it's appropriate for his
> > content.
> >
> > So we'd better get used to a world where we encounter tags of that
> > nature.
> >
> > Translation is not a new subject for this list - transliteration
> > warranted a section in Peter Constable's 2002 SIL report
> > <http://www.unicode.org/notes/tn8/SILEWP2002-003.pdf>, and I've found
> > mention of transliteration in the archives in 1998. (Google doesn't
> > seem to search the old archives consistently, however.. I'd better
> > move them....)
> >
> >
> >
> >
> >
> >
> >
> >
> > _______________________________________________
> > Ietf-languages mailing list
> > Ietf-languages at alvestrand.no
> > http://www.alvestrand.no/mailman/listinfo/ietf-languages
> >
> >
> >

-- 
-------------------------------------------------------------
Tex Texin   cell: +1 781 789 1898   mailto:Tex at XenCraft.com
Xen Master                          http://www.i18nGuy.com
                         
XenCraft		            http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------

