Anomaly in upcoming registry
gerard.lang at insee.fr
Tue Jun 30 17:31:03 CEST 2009
I completely agree with this position.
In fact, the situation concerning "sh" and the related code elements inside ISO 639 is more than exceptional.
1-ISO Recommendation R 639 (November 1967) includes the code element "Sh" as the symbol for the language (name) "Serbo-Croat/Serbo-croate", with the indexes "861/862" inside the UDC (Universal Decimal Classification), where it is part of the Slavonic language family, and the combination "Sh/YU/" with the country (name) symbol for Yugoslavia.
2-The ISO 639 standard (1988-04-01) includes, among its 136 initial alpha-2 code elements, the alpha-2 code element "sh" for the representation of the language name "Srpskohrvatski/serbo-croate/Serbo-Croatian", as well as "hr" for the language name "Hrvatski/croate/Croatian" and "sr" for the language name "Srpski/serbe/Serbian".
There is no entry concerning the language name "Bosnian".
3-The introduction of ISO 639-2 (1998-10-22) states: "The languages (names) listed in ISO 639-1 are a subset of the languages (names) listed in this part of ISO 639; every language code (element) in the two letters code set has a corresponding language code (element) in the alpha-3 list, but not necessarily vice-versa." ISO 639-2 includes an alpha-3 (bibliographic/terminologic) entry "scr/hrv" for the language name "Croatian/croate" and another alpha-3 entry "scc/srp" for the language name "Serbian/serbe", but it does not include any alpha-3 code element for the language name "Serbo-Croatian/serbo-croate". The promise given in the ISO 639-2 introduction is therefore not fulfilled for the ISO 639(-1) alpha-2 code element "sh", which has no alpha-3 ISO 639-2 counterpart. There is no entry concerning the language name "Bosnian".
4-Between 1992 and 1993, four (Croatia, Slovenia, Bosnia-Herzegovina and Macedonia) of the six Republics that were formerly united inside the Socialist Federal Republic of Yugoslavia acquired independence, became Member States of the United Nations and received alpha-2 ISO 3166-1 code elements.
5-On 2000-02-18, ISO 639/RA-JAC decided to deprecate the ISO 639(-1) alpha-2 code element "sh" "because there were separate language code (elements) for each language (name) represented (Serbian, Croatian and then Bosnian was added)".
6-On the same day, ISO 639/RA-JAC decided to add a new alpha-2 code element "bs" to ISO 639 and a new alpha-3 code element "bos" to ISO 639-2 to represent the language name "Bosnian/bosniaque" (along with 24 other entries added to ISO 639-2 on the same day; among them, only "Sign languages" received just an alpha-3 ISO 639-2 code element, "sgn", and no alpha-2 ISO 639-1 code element).
7-Nevertheless, ISO 639-1 (2002-07-18) reintegrated the alpha-2 code element "sh", representing the language name "srpskohrvatski (jezik)/serbo-croate/Serbo-Croatian", as an entry, along with the three other code elements "bs", "hr" and "sr", representing respectively the three language names "bosanski (jezik)/bosniaque/Bosnian", "hrvatski jezik/croate/Croatian" and "srpski (jezik)/serbe/Serbian".
8-After this reintegration inside ISO 639-1, the language name "Serbo-Croat" was never included as a new entry inside ISO 639-2, and on 2005-YY-XX (no more precise date is given) the ISO 639/RA-JAC decided to "reaffirm the deprecated status" of "sh" inside ISO 639-1.
9-ISO 639-3 (2007-02-05) includes the following entries: "bos" (Bosnian), "hbs" (Serbo-Croatian, explicitly linked to "sh" [deprecated]), "hrv" (Croatian) and "srp" (Serbian), as well as "mkd" (Macedonian, which also has "mk" [and UDC index 866 inside the Slavic language family] already inside ISO 639:1988) and "slv" (Slovenian, which also has UDC index 863 inside the Slavic language family, already inside ISO 639:1988; moreover, the language name "Slovenian" has the symbol "Sn" and the combination "Sn/YU" inside ISO R 639 (1967)).
10-On 2008-04-07, the Croatian National and University Library, the Croatian Standard Institute, the National Library of Serbia and the Institute for Standardization of Serbia jointly wrote a letter to the ISO 639-2 Registration Authority and to the ISO Central Secretariat. The letter explained that the alpha-3 ISO 639-2/B code elements "scr" and "scc", which were abbreviations for "Serbo-Croatian written in Roman alphabet" and "Serbo-Croatian written in Cyrillic alphabet", should no longer be used, and that the corresponding alpha-3 ISO 639-2/T code elements "hrv" and "srp" should replace them to represent respectively the Croatian and Serbian language names inside ISO 639-2.
On 2008-06-28, ISO 639/RA-JAC accepted this request and decided to deprecate "scr" and "scc" in favor of "hrv" and "srp" respectively.
11-ISO 639-5 (2008-05-15) includes the entry "sla", an alpha-3 ISO 639-5 code element representing the language family name "Slavic languages (remainder group)", which sits in the hierarchy under the alpha-3 ISO 639-5 code element "ine", representing the language family name "Indo-European (remainder group)".
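The code elements listed in points 1 to 11 can be summarised as a small lookup table. The following Python sketch is only an illustration: the class name, field names and helper functions are hypothetical, the values are copied from the points above rather than from any official code table, and the 639-2 column holds only the ISO 639-2/T code elements.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LanguageEntry:
    """One language name with its code elements (hypothetical structure)."""
    name: str
    alpha2: Optional[str]        # ISO 639-1 alpha-2 code element
    alpha3_part2: Optional[str]  # ISO 639-2/T alpha-3 code element
    alpha3_part3: Optional[str]  # ISO 639-3 alpha-3 code element
    alpha2_deprecated: bool = False


# Values taken from the points above; nothing here comes from an official file.
ENTRIES = [
    LanguageEntry("Bosnian", "bs", "bos", "bos"),
    LanguageEntry("Croatian", "hr", "hrv", "hrv"),
    LanguageEntry("Serbian", "sr", "srp", "srp"),
    LanguageEntry("Serbo-Croatian", "sh", None, "hbs", alpha2_deprecated=True),
]

# ISO 639-2/B code elements deprecated in 2008, with their 639-2/T replacements.
RETIRED_B_CODES = {"scr": "hrv", "scc": "srp"}


def alpha2_without_part2(entries):
    """Return alpha-2 code elements that have no ISO 639-2 counterpart."""
    return [e.alpha2 for e in entries if e.alpha2 and e.alpha3_part2 is None]


def normalize_alpha3(code):
    """Map a retired 639-2/B code element to its replacement, else pass through."""
    return RETIRED_B_CODES.get(code, code)


def alpha2_to_part3(entries):
    """Build an alpha-2 -> ISO 639-3 mapping, including deprecated alpha-2 codes."""
    return {e.alpha2: e.alpha3_part3 for e in entries if e.alpha2}
```

With this table, alpha2_without_part2(ENTRIES) returns ["sh"], which is the gap described in point 3, while alpha2_to_part3(ENTRIES)["sh"] gives "hbs", the link recorded in ISO 639-3.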
From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-bounces at alvestrand.no] On behalf of Mark Davis
Sent: Monday 29 June 2009 23:36
To: John Cowan
Cc: ietf-languages at iana.org; ISO639-3 at sil.org
Subject: Re: Anomaly in upcoming registry
Good point; the target for 639-1/2 is different, and the threshold for deprecation is different. And given this conversation, I think it is pretty clear that we should un-deprecate sh in the registry. We are following 639-3 in not being restrictive about the codes we add (understatement) to the registry; they considered the issue of deprecating hbs (=sh) and decided not to, so we should follow their lead.
On Mon, Jun 29, 2009 at 13:22, John Cowan <cowan at ccil.org> wrote:
Mark Davis scripsit:
> hbs = sh, yet
> hbs is not Deprecated, and
> sh is Deprecated
It's actually worse than that. hbs in 639-2 is deprecated ("retired"),
but hbs in 639-3 is not deprecated.
> We could take ISO 639-3 as superseding 639-1 on the issue of deprecation,
> and I think that would be the right thing to do. However, it would be
> cleaner yet if ISO 639-1 were to un-deprecate sh, so that it was consistent
> with ISO 639-3.
For "639-1" read "639-1 and 639-2". But there's a policy question here:
coding a language in -1 or -2 is a policy decision, not merely a technical
one: it involves an explicit value judgement on which languages are considered
important enough to get -1 codes or membership in the -2 set. The various RAs
reserve the right, it seems to me, to change their minds about this (as we
reserve the right to ignore it when they remove codes).
My corporate data's a mess!             John Cowan
It's all semi-structured, no less.      http://www.ccil.org/~cowan
But I'll be carefree                    cowan at ccil.org
On an XML DBMS.