Registration of el-Latn language tag

Mark Davis mark.davis at icu-project.org
Thu Sep 29 15:03:27 CEST 2005


 > I don't understand why we are doing both.

Take a look at http://www.inter-locale.com/ID/why-rfc3066bis.html, which 
spells it out in more detail.

In the case of el-Latn, the tag simply says that it is Modern Greek, 
written in the Latin script. (Modern, simply because ISO 639 
distinguishes it from grc.) That tag is certainly understandable, and 
thus is nicely representable with a generative mechanism. Simply because 
a tag has a wide sphere of denotation doesn't mean there is a problem.
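
A minimal sketch of what "generative" means here, in Python: the meaning of 
a tag is read off its subtags, with no per-tag registration. The three 
lookup tables and the function name describe() are hypothetical, hold only 
a few entries for illustration, and cover only the 
language[-script][-region] shape.

```python
# Hypothetical, deliberately tiny lookup tables. A real implementation would
# read the full subtag lists instead of hard-coding a few entries.
LANGUAGES = {"el": "Modern Greek", "en": "English", "haw": "Hawaiian"}
SCRIPTS = {"Latn": "Latin", "Grek": "Greek"}
REGIONS = {"GB": "United Kingdom", "DE": "Germany", "US": "United States"}


def describe(tag):
    """Describe a language[-script][-region] tag from its subtags alone."""
    subtags = tag.split("-")
    language = LANGUAGES[subtags[0].lower()]  # KeyError if language unknown
    parts = []
    for subtag in subtags[1:]:
        if len(subtag) == 4 and subtag.title() in SCRIPTS:
            parts.append(SCRIPTS[subtag.title()] + " script")
        elif len(subtag) == 2 and subtag.upper() in REGIONS:
            parts.append(REGIONS[subtag.upper()])
        else:
            raise ValueError("unknown subtag: " + subtag)
    if not parts:
        return language
    return language + " (" + ", ".join(parts) + ")"


print(describe("el-Latn"))
print(describe("haw-DE"))
```

Nothing in the sketch knows about el-Latn or haw-DE as whole tags; both are 
understood purely from their parts.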

Anyone familiar with modern Greek will understand that there is much 
variation in the spelling of languages written in transliteration, such 
as el-Latn: for χαρακτήρα I could see charaktḗra, but also see variation 
between ch and kh, and between ḗ, é, í, e, and i, giving some 10 
different combinations. If and when it becomes really necessary to 
distinguish among them, we'll see important variants registered; one 
could foresee an el-Latn-UNGEGN, for example. Thus we see the combination 
of a generative mechanism with registration. And UNGEGN is a good 
example of a variant tag that could meaningfully and usefully apply to a 
large number of combinations of language, script, and region.
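
The "some 10 different combinations" figure is just 2 choices for chi times 
5 choices for the accented eta. A short Python sketch makes the count 
concrete; the function name romanizations() and the fixed "arakt" / "ra" 
pieces are assumptions for illustration, not any particular transliteration 
standard.

```python
from itertools import product

# The two points of variation named above for the one Greek word.
CHI_CHOICES = ["ch", "kh"]
ETA_CHOICES = ["ḗ", "é", "í", "e", "i"]


def romanizations():
    """Return every spelling obtained by combining the two choices."""
    return [
        chi + "arakt" + eta + "ra"
        for chi, eta in product(CHI_CHOICES, ETA_CHOICES)
    ]


variants = romanizations()
print(len(variants))
print(", ".join(variants))
```

All of these would carry the same el-Latn tag; a registered variant would 
only be needed to pick out one of them.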

Similarly, en-GB, with en defined as in ISO 639, is perfectly 
understandable, and encompasses a wide range of variation. Simply 
because a tag has a wide sphere of denotation doesn't mean there is a 
problem. Of course, most of the time, what you will see is modern 
standard British conventions, so it would be silly to not use a 
spell-checker because it might be Shakespearian. If and when it gets to 
the point where it is really necessary to make that distinction, we 
should see either an extlang (by that time) or a variant being registered.

It is perfectly sensible to say "Mark Davis, living in Menlo Park". The 
fact that it denotes some 11 people doesn't mean that that phrase is 
useless -- far from it. It dramatically limits the possibilities; you 
know it is not John Smith, living in Plano, or anything like that. And 
typically the context is sufficient to disambiguate *to the extent that 
someone cares*.  If and when it is required to be more specific than 
that, it is possible to specify more precisely what is meant.

 > So what are the criteria for registering a language or prohibiting a 
 > registration?

See
http://www.inter-locale.com/ID/draft-ietf-ltru-registry-13.html#registrationProc 
and 
http://www.inter-locale.com/ID/draft-ietf-ltru-registry-13.html#possibleReg

Tex Texin wrote:

>Mark,
>
>I acknowledged in my original note that precision was bad for "en" as well.
>I understand the tradeoffs of a generative mechanism vs a registry.
>
>I don't understand why we are doing both.
>
>What I am quite simply trying to understand is what it is we are doing on this
>list now.
>A few years ago this was a registry, and there was discussion of what was
>and was not considered a language.
>There seemed to be a criteria and data was cited to confirm or disconfirm
>candidates.
>
>Then we proposed a more sophisticated naming pattern and the emphasis now
>seems to be whether the name fits the pattern and whether the name might be
>useful, but the tie to references and language seems very removed.
>
>And if we are to work with a generative naming scheme, I don't see why we
>need to register the possibilities.
>On the other hand, if we have a registry, I don't see that the naming scheme is all
>that important.
>
>(Moreso, as the matching rules have become more complex, and we now delight
>in moving to new names that match the scheme better, but we still support
>the old names, so heck we are just creating more work.)
>
>So I am just trying to understand what the nature of the discussion is to be
>when we decide whether or not to register a tag.
>What is a possible argument for not encoding any language-latn? Especially,
>if you accept any of a number of transliteration schemes.
>
>Once we answer that, I hope to be able to know what I am supposed to do with
>a registered tag.
>Right now if I follow your guideline, I can't know anything about the
>language, and I have to wonder why I should convince others that it is a
>good idea to include a language tag in their web page and other meta data.
>
>Personally, I disagree with your en-GB example. From a linguistic
>standpoint, maybe it is English from the UK.
>But most people would be quite happy to have their en-GB spell checker
>reject most of it.
>And I never said that if 5 examples were offered, that they became the sole
>definition of the language. It is understood that they were representative
>and not all-encompassing.
>
>If language tags are to be used on the Web, and supported by office tools,
>and to be recognizable by most users (given suitable expanded names and not
>subtags), they should have meanings that typical users can relate to.
>As I have said many times, I understand the need of linguists, and we have
>SIL and more detailed standards with many more entries for their use.
>
>But we should have a clear set (or subset) of tags that most users can work
>with and get what they expect, and where it is unclear, we should be able to
>give them a definition. And there should be a reasonable precision to the
>definition.
>I should be able to examine some text and determine its tag. Looking at your
>example could you determine it was en-GB and not something else, perhaps not
>even in the English family?
>
>We are no longer defining anything that is of use to typical users, and if
>we want en-GB to include the example you offered
>then we have also undermined what users and most of the software industry
>understood the tag to mean.
>
>Now we might agree that the tag always included such text and it was
>tolerated that the example was not supported by spell checkers or text to
>voice devices.
>But as we are expanding the number of tags, we need to guide the industry as
>to their meaning and usage.
>And anyone following these discussions will likely be very confused as to
>the meaning of the tags.
>That does not serve our industry. Or at least not the industry I thought the
>IETF was about.
>
>So what are the criteria for registering a language or prohibiting a
>registration?
>
>tex
>
>
>
>Mark Davis wrote:
>  
>
>>a) We have never had the accuracy that Tex is looking for. "en-US" does
>>not tell me exactly what to expect; it ranges from the English used in
>>"Hee Haw", to rap, to Robert De Niro's English in "Raging Bull". What is
>>the practical problem that having el-Latn causes? The lack of
>>spell-checkers doesn't mean it's an illegitimate tag; I don't have a
>>spell-checker for the following, but it is valid en-GB:
>>
>>Beatrice: Not til God make men of some other mettal then
>>earth, would it not grieue a woman to be ouer-masterd with
>>a peece of valiant dust? to make an account of her life to a clod
>>of waiward marle? no vnckle, ile none: Adams sonnes are my
>>brethren, and truely I holde it a sinne to match in my kin-
>>red. [Much Ado About Nothing (Quarto) 2.1]
>>
>>b). We already have a generative mechanism in 3066 primus. All of the
>>tags at the end of this message work. Is this a problem? Nope.
>>Generative mechanisms have a huge advantage over registrations. If I
>>have an application or a customer that needs haw-DE, I can use that tag
>>right now, without having to wait months or years for it to be registered.
>>
>>The meaning of the tag is clear, even though the precise denotation is
>>not -- but you can *never* get precision anyway. Having a registration
>>citing 5 books doesn't mean that the text is *limited* to precisely what
>>is described in those 5 books (all and only the words in those books,
>>all and only the grammatical constructs and combinations of words in
>>those 5 books)  -- such registrations would be useless.
>>
>>========
>>
>>haw-AD: Hawaiian(Andorra), haw-AE: Hawaiian(United Arab Emirates),
>>haw-AF: Hawaiian(Afghanistan), haw-AG: Hawaiian(Antigua and Barbuda),
>>haw-AI: Hawaiian(Anguilla), haw-AL: Hawaiian(Albania), haw-AM:
>>Hawaiian(Armenia), haw-AN: Hawaiian(Netherlands Antilles), haw-AO:
>>Hawaiian(Angola), haw-AQ: Hawaiian(Antarctica), haw-AR:
>>Hawaiian(Argentina), haw-AS: Hawaiian(American Samoa), haw-AT:
>>Hawaiian(Austria), haw-AU: Hawaiian(Australia), haw-AW: Hawaiian(Aruba),
>>haw-AX: Hawaiian(Aland Islands), haw-AZ: Hawaiian(Azerbaijan), haw-BA:
>>Hawaiian(Bosnia and Herzegovina), haw-BB: Hawaiian(Barbados), haw-BD:
>>Hawaiian(Bangladesh), haw-BE: Hawaiian(Belgium), haw-BF:
>>Hawaiian(Burkina Faso), haw-BG: Hawaiian(Bulgaria), haw-BH:
>>Hawaiian(Bahrain), haw-BI: Hawaiian(Burundi), haw-BJ: Hawaiian(Benin),
>>haw-BM: Hawaiian(Bermuda), haw-BN: Hawaiian(Brunei), haw-BO:
>>Hawaiian(Bolivia), haw-BQ: Hawaiian(British Antarctic Territory),
>>haw-BR: Hawaiian(Brazil), haw-BS: Hawaiian(Bahamas), haw-BT:
>>Hawaiian(Bhutan), haw-BV: Hawaiian(Bouvet Island), haw-BW:
>>Hawaiian(Botswana), haw-BY: Hawaiian(Belarus), haw-BZ: Hawaiian(Belize),
>>haw-CA: Hawaiian(Canada), haw-CC: Hawaiian(Cocos (Keeling) Islands),
>>haw-CD: Hawaiian(Congo (Kinshasa)), haw-CF: Hawaiian(Central African
>>Republic), haw-CG: Hawaiian(Congo (Brazzaville)), haw-CH:
>>Hawaiian(Switzerland), haw-CI: Hawaiian(Ivory Coast), haw-CK:
>>Hawaiian(Cook Islands), haw-CL: Hawaiian(Chile), haw-CM:
>>Hawaiian(Cameroon), haw-CN: Hawaiian(China), haw-CO: Hawaiian(Colombia),
>>haw-CR: Hawaiian(Costa Rica), haw-CS: Hawaiian(Serbia And Montenegro),
>>haw-CT: Hawaiian(Canton and Enderbury Islands), haw-CU: Hawaiian(Cuba),
>>haw-CV: Hawaiian(Cape Verde), haw-CX: Hawaiian(Christmas Island),
>>haw-CY: Hawaiian(Cyprus), haw-CZ: Hawaiian(Czech Republic), haw-DD:
>>Hawaiian(East Germany), haw-DE: Hawaiian(Germany), haw-DJ:
>>Hawaiian(Djibouti), haw-DK: Hawaiian(Denmark), haw-DM:
>>Hawaiian(Dominica), haw-DO: Hawaiian(Dominican Republic), haw-DZ:
>>Hawaiian(Algeria), haw-EC: Hawaiian(Ecuador), haw-EE: Hawaiian(Estonia),
>>haw-EG: Hawaiian(Egypt), haw-EH: Hawaiian(Western Sahara), haw-ER:
>>Hawaiian(Eritrea), haw-ES: Hawaiian(Spain), haw-ET: Hawaiian(Ethiopia),
>>haw-FI: Hawaiian(Finland), haw-FJ: Hawaiian(Fiji), haw-FK:
>>Hawaiian(Falkland Islands), haw-FM: Hawaiian(Micronesia), haw-FO:
>>Hawaiian(Faroe Islands), haw-FQ: Hawaiian(French Southern and Antarctic
>>Territories), haw-FR: Hawaiian(France), haw-FX: Hawaiian(Metropolitan
>>France), haw-GA: Hawaiian(Gabon), haw-GB: Hawaiian(United Kingdom),
>>haw-GD: Hawaiian(Grenada), haw-GE: Hawaiian(Georgia), haw-GF:
>>Hawaiian(French Guiana), haw-GH: Hawaiian(Ghana), haw-GI:
>>Hawaiian(Gibraltar), haw-GL: Hawaiian(Greenland), haw-GM:
>>Hawaiian(Gambia), haw-GN: Hawaiian(Guinea), haw-GP:
>>Hawaiian(Guadeloupe), haw-GQ: Hawaiian(Equatorial Guinea), haw-GR:
>>Hawaiian(Greece), haw-GS: Hawaiian(South Georgia and the South Sandwich
>>Islands), haw-GT: Hawaiian(Guatemala), haw-GU: Hawaiian(Guam), haw-GW:
>>Hawaiian(Guinea-Bissau), haw-GY: Hawaiian(Guyana), haw-HK: Hawaiian(Hong
>>Kong S.A.R., China), haw-HM: Hawaiian(Heard Island and McDonald
>>Islands), haw-HN: Hawaiian(Honduras), haw-HR: Hawaiian(Croatia), haw-HT:
>>Hawaiian(Haiti), haw-HU: Hawaiian(Hungary), haw-ID: Hawaiian(Indonesia),
>>haw-IE: Hawaiian(Ireland), haw-IL: Hawaiian(Israel), haw-IN:
>>Hawaiian(India), haw-IO: Hawaiian(British Indian Ocean Territory),
>>haw-IQ: Hawaiian(Iraq), haw-IR: Hawaiian(Iran), haw-IS:
>>Hawaiian(Iceland), haw-IT: Hawaiian(Italy), haw-JM: Hawaiian(Jamaica),
>>haw-JO: Hawaiian(Jordan), haw-JP: Hawaiian(Japan), haw-JT:
>>Hawaiian(Johnston Island), haw-KE: Hawaiian(Kenya), haw-KG:
>>Hawaiian(Kyrgyzstan), haw-KH: Hawaiian(Cambodia), haw-KI:
>>Hawaiian(Kiribati), haw-KM: Hawaiian(Comoros), haw-KN: Hawaiian(Saint
>>Kitts and Nevis), haw-KP: Hawaiian(North Korea), haw-KR: Hawaiian(South
>>Korea), haw-KW: Hawaiian(Kuwait), haw-KY: Hawaiian(Cayman Islands),
>>haw-KZ: Hawaiian(Kazakhstan), haw-LA: Hawaiian(Laos), haw-LB:
>>Hawaiian(Lebanon), haw-LC: Hawaiian(Saint Lucia), haw-LI:
>>Hawaiian(Liechtenstein), haw-LK: Hawaiian(Sri Lanka), haw-LR:
>>Hawaiian(Liberia), haw-LS: Hawaiian(Lesotho), haw-LT:
>>Hawaiian(Lithuania), haw-LU: Hawaiian(Luxembourg), haw-LV:
>>Hawaiian(Latvia), haw-LY: Hawaiian(Libya), haw-MA: Hawaiian(Morocco),
>>haw-MC: Hawaiian(Monaco), haw-MD: Hawaiian(Moldova), haw-MG:
>>Hawaiian(Madagascar), haw-MH: Hawaiian(Marshall Islands), haw-MI:
>>Hawaiian(Midway Islands), haw-MK: Hawaiian(Macedonia), haw-ML:
>>Hawaiian(Mali), haw-MM: Hawaiian(Myanmar), haw-MN: Hawaiian(Mongolia),
>>haw-MO: Hawaiian(Macao S.A.R., China), haw-MP: Hawaiian(Northern Mariana
>>Islands), haw-MQ: Hawaiian(Martinique), haw-MR: Hawaiian(Mauritania),
>>haw-MS: Hawaiian(Montserrat), haw-MT: Hawaiian(Malta), haw-MU:
>>Hawaiian(Mauritius), haw-MV: Hawaiian(Maldives), haw-MW:
>>Hawaiian(Malawi), haw-MX: Hawaiian(Mexico), haw-MY: Hawaiian(Malaysia),
>>haw-MZ: Hawaiian(Mozambique), haw-NA: Hawaiian(Namibia), haw-NC:
>>Hawaiian(New Caledonia), haw-NE: Hawaiian(Niger), haw-NF:
>>Hawaiian(Norfolk Island), haw-NG: Hawaiian(Nigeria), haw-NI:
>>Hawaiian(Nicaragua), haw-NL: Hawaiian(Netherlands), haw-NO:
>>Hawaiian(Norway), haw-NP: Hawaiian(Nepal), haw-NQ: Hawaiian(Dronning
>>Maud Land), haw-NR: Hawaiian(Nauru), haw-NT: Hawaiian(Neutral Zone),
>>haw-NU: Hawaiian(Niue), haw-NZ: Hawaiian(New Zealand), haw-OM:
>>Hawaiian(Oman), haw-PA: Hawaiian(Panama), haw-PC: Hawaiian(Pacific
>>Islands Trust Territory), haw-PE: Hawaiian(Peru), haw-PF:
>>Hawaiian(French Polynesia), haw-PG: Hawaiian(Papua New Guinea), haw-PH:
>>Hawaiian(Philippines), haw-PK: Hawaiian(Pakistan), haw-PL:
>>Hawaiian(Poland), haw-PM: Hawaiian(Saint Pierre and Miquelon), haw-PN:
>>Hawaiian(Pitcairn), haw-PR: Hawaiian(Puerto Rico), haw-PS:
>>Hawaiian(Palestinian Territory), haw-PT: Hawaiian(Portugal), haw-PU:
>>Hawaiian(U.S. Miscellaneous Pacific Islands), haw-PW: Hawaiian(Palau),
>>haw-PY: Hawaiian(Paraguay), haw-PZ: Hawaiian(Panama Canal Zone), haw-QA:
>>Hawaiian(Qatar), haw-QO: Hawaiian(Outlying Oceania), haw-RE:
>>Hawaiian(Reunion), haw-RO: Hawaiian(Romania), haw-RU: Hawaiian(Russia),
>>haw-RW: Hawaiian(Rwanda), haw-SA: Hawaiian(Saudi Arabia), haw-SB:
>>Hawaiian(Solomon Islands), haw-SC: Hawaiian(Seychelles), haw-SD:
>>Hawaiian(Sudan), haw-SE: Hawaiian(Sweden), haw-SG: Hawaiian(Singapore),
>>haw-SH: Hawaiian(Saint Helena), haw-SI: Hawaiian(Slovenia), haw-SJ:
>>Hawaiian(Svalbard and Jan Mayen), haw-SK: Hawaiian(Slovakia), haw-SL:
>>Hawaiian(Sierra Leone), haw-SM: Hawaiian(San Marino), haw-SN:
>>Hawaiian(Senegal), haw-SO: Hawaiian(Somalia), haw-SR:
>>Hawaiian(Suriname), haw-ST: Hawaiian(Sao Tome and Principe), haw-SU:
>>Hawaiian(Union of Soviet Socialist Republics), haw-SV: Hawaiian(El
>>Salvador), haw-SY: Hawaiian(Syria), haw-SZ: Hawaiian(Swaziland), haw-TC:
>>Hawaiian(Turks and Caicos Islands), haw-TD: Hawaiian(Chad), haw-TF:
>>Hawaiian(French Southern Territories), haw-TG: Hawaiian(Togo), haw-TH:
>>Hawaiian(Thailand), haw-TJ: Hawaiian(Tajikistan), haw-TK:
>>Hawaiian(Tokelau), haw-TL: Hawaiian(East Timor), haw-TM:
>>Hawaiian(Turkmenistan), haw-TN: Hawaiian(Tunisia), haw-TO:
>>Hawaiian(Tonga), haw-TR: Hawaiian(Turkey), haw-TT: Hawaiian(Trinidad and
>>Tobago), haw-TV: Hawaiian(Tuvalu), haw-TW: Hawaiian(Taiwan), haw-TZ:
>>Hawaiian(Tanzania), haw-UA: Hawaiian(Ukraine), haw-UG: Hawaiian(Uganda),
>>haw-UM: Hawaiian(United States Minor Outlying Islands), haw-US:
>>Hawaiian(United States), haw-UY: Hawaiian(Uruguay), haw-UZ:
>>Hawaiian(Uzbekistan), haw-VA: Hawaiian(Vatican), haw-VC: Hawaiian(Saint
>>Vincent and the Grenadines), haw-VD: Hawaiian(North Vietnam), haw-VE:
>>Hawaiian(Venezuela), haw-VG: Hawaiian(British Virgin Islands), haw-VI:
>>Hawaiian(U.S. Virgin Islands), haw-VN: Hawaiian(Vietnam), haw-VU:
>>Hawaiian(Vanuatu), haw-WF: Hawaiian(Wallis and Futuna), haw-WK:
>>Hawaiian(Wake Island), haw-WS: Hawaiian(Samoa), haw-YD:
>>Hawaiian(People's Democratic Republic of Yemen), haw-YE:
>>Hawaiian(Yemen), haw-YT: Hawaiian(Mayotte), haw-ZA: Hawaiian(South
>>Africa), haw-ZM: Hawaiian(Zambia), haw-ZW: Hawaiian(Zimbabwe)
>>
>>Harald Tveit Alvestrand wrote:
>>
>>    
>>
>>>--On onsdag, september 28, 2005 04:19:18 -0700 Tex Texin
>>><tex at xencraft.com> wrote:
>>>
>>>      
>>>
>>>>It's a good reason not to Register generative tags.
>>>>
>>>>So when someone requests a tag now, what is the reviewer to look at?
>>>>
>>>>We used to identify a few representative books, which I always thought
>>>>meant we were identifying a particular set of rules around the language
>>>>(spelling, orthography).
>>>>
>>>>The registration for el-Latn more or less stipulates the need for
>>>>transliteration, mentions that they exist, with  a link to a site that
>>>>collects transliteration systems. (Which btw, I think is a really bad
>>>>idea
>>>>in the event the site goes away or completely changes its list of
>>>>reference materials.) But it doesn't really nail down what it is. (It
>>>>mentions a standard, but doesn't say the tag is referring to that
>>>>particular standard.)
>>>>
>>>>So we are no longer identifying a reference or a particular language,
>>>>but
>>>>just the concept that there seems to be something like a language of
>>>>this
>>>>persuasion. I guess we were asking for this with es-419. (Which I was
>>>>also a proponent of.)
>>>>        
>>>>
>>>Yep.
>>>
>>>For this particular request, one reason why I don't care much what the
>>>registration says is that we have a community consensus that anyone
>>>using a tag consisting of a 639 language and a script doesn't have to
>>>register it - he can just use it if he thinks it's appropriate for his
>>>content.
>>>
>>>So we'd better get used to a world where we encounter tags of that
>>>nature.
>>>
>>>Transliteration is not a new subject for this list - transliteration
>>>warranted a section in Peter Constable's 2002 SIL report
>>><http://www.unicode.org/notes/tn8/SILEWP2002-003.pdf>, and I've found
>>>mention of transliteration in the archives in 1998. (Google doesn't
>>>seem to search the old archives consistently, however.. I'd better
>>>move them....)
>>>


