Suggestion: registration of variant subtags for Aluku, Ndyuka, and Pamaka

Lang Gérard gerard.lang at insee.fr
Mon Jan 26 15:49:10 CET 2009


Dear Doug,

1-You are entirely correct about the arithmetic: t is not q, so the count is 20, not 17. I hope you will excuse my error in this calculation!
My main point was that "eur", formerly associated with the so-called Europanto, can no longer be reused as an ISO 639-3 alpha-3 code element, even if, as you rightly point out, this does not reduce the number of alpha-3 code elements in use. I also agree that the 130 retired alpha-3 code elements and the 114 ISO 639-5 code elements should be counted (but these slight and useful corrections to the numbers do not really change the size of the problem).
When reasoning in terms of "visual association", no letter is better than another, even if I certainly agree that most of the "good" code elements are taken first (but this is another question!) for reasons of letter frequency distribution (though the frequencies are perhaps not the same across different languages, even in romanized forms?).
2-The alpha-2 code element "MF" is associated with the country name "SAINT-MARTIN (FRENCH PART)", so that M is for Saint-Martin, and F for French part. 
And if SINT-MAARTEN, which will soon no longer be part of the Netherlands Antilles, becomes a new entry in ISO 3166-1, a code element like NM (N for Netherlands and M for Maarten) would be a very good way to show the division of this island's territory between two sovereignties!
Kind regards,
Gérard LANG 
-----Original Message-----
From: Doug Ewell [mailto:doug at ewellic.org]
Sent: Monday, 26 January 2009 14:51
To: ietf-languages at iana.org
Cc: Lang Gérard
Subject: Re: Suggestion: registration of variant subtags for Aluku, Ndyuka, and Pamaka

"Lang Gérard" <gerard dot lang at insee dot fr> wrote:

> The number of reserved tags from qaa to qtz is 26.17= 442, so that 17
> 576 - 442 = 17 134 possibilities are remaining  available to assign 
> codes to about 7 700 languages

There are 20 letters from 'a' through 't', not 17.

> (I recently concurred to reduce their number, by obtaining from ISO
> 639-3 the deprecation of the non-existent "europanto (eur)" !)

This doesn't reduce their number, because the MA doesn't reuse code values.  "eur" is no longer available for use by any language name. 
This is true for the other 130+ code elements retired by ISO 639-3 since its inception.  (ISO 639-3 "retires" code elements, they don't deprecate
them.)

I also forgot to count 114 code elements used by ISO 639-5 from the same 3-letter code space, which further reduces the choices of available code elements.  So I'm not completely OK with my arithmetic either; I underestimated the problem.
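The corrected arithmetic in this exchange can be checked with a minimal sketch. It takes the figures quoted in the thread at face value: 20 second letters ('a' through 't') in the reserved range, about 130 retired code elements, 114 ISO 639-5 code elements, and about 7 700 languages. Treating the retired and ISO 639-5 counts as exact and non-overlapping is an assumption made only for illustration, and the variable names are hypothetical.

```python
from string import ascii_lowercase

# All three-letter combinations of 'a'..'z'.
TOTAL = 26 ** 3  # 17 576

# Reserved range qaa..qtz: second letter 'a'..'t' (20 letters), third letter 'a'..'z'.
private_use = [
    f"q{second}{third}"
    for second in ascii_lowercase[:20]
    for third in ascii_lowercase
]

# Figures taken from the thread; assumed exact and non-overlapping here.
RETIRED = 130      # code elements retired by ISO 639-3 ("130+" in the message)
ISO_639_5 = 114    # code elements used by ISO 639-5
LANGUAGES = 7700   # approximate number of languages to encode

after_reserved = TOTAL - len(private_use)
remaining = after_reserved - RETIRED - ISO_639_5
occupation = LANGUAGES / remaining

print(f"reserved qaa..qtz: {len(private_use)}")
print(f"left after reserved range: {after_reserved}")
print(f"left after retired and ISO 639-5: {remaining}")
print(f"occupation ratio: {occupation:.3f}")
```

Under these assumptions the reserved range holds 520 code elements rather than 442, so the pool left for language names is smaller than first stated.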

> Nevertheless, this result says that the ratio of occupation from 7 700 
> to 150134 is less than 51 /100, that does not very much support  the 
> thesis that a systematic visual association between the reference name 
> of each language and the code element  for the representation of this 
> language name is generally impossible (I do not write that this is 
> always possible, maybe a few benign exceptions remain necessary). On 
> the contrary, my position is that, by making sometimes astute choices, 
> this task can be rendered possible.

Letter frequency distribution in language names -- the fact that 't' and 'n' and 'r' are used much more frequently than 'q' and 'x' and 'z' -- would tend to argue against your position.  Most of the good ones are already taken.

> So, we [ISO 3166-1/MA] are left with 676 - (43 + 51) = 582 really 
> available code elements, to be compared with 246 active entries, and
> 51 more not reusable code elements, that gives a ratio of 297/582 = 
> 51/100 that is exactly comparable with the ratio for alpha-3 ISO 639-3 
> code elements !

So you can appreciate the nature of the problem, especially when a new country comes along whose name begins with a common letter like 'M' or 'S', and you have to resort to something like "MF" for "Saint-Martin."
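The ISO 3166-1 ratio quoted above can be reproduced the same way. This is only a sketch of the quoted arithmetic: the numbers 43, 51 and 246 come from the quoted message, the excerpt does not say what the 43 excluded code elements are, and the variable names are hypothetical.

```python
# All two-letter combinations of 'A'..'Z'.
TOTAL = 26 ** 2  # 676

# Figures as quoted in the message; the excerpt does not explain the 43.
EXCLUDED = 43        # code elements subtracted as not available
NOT_REUSABLE = 51    # code elements that cannot be reused
ACTIVE = 246         # active entries

# Follows the quoted calculation: 676 - (43 + 51), then (246 + 51) / 582.
available = TOTAL - (EXCLUDED + NOT_REUSABLE)
occupied = ACTIVE + NOT_REUSABLE
ratio = occupied / available

print(f"available: {available}, occupied: {occupied}, ratio: {ratio:.2f}")
```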

--
Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages


