Duplicate Busters: Survey #1

Doug Ewell doug at ewellic.org
Thu Jul 31 06:59:37 CEST 2008


This is the first of two surveys that are being distributed to both the 
LTRU and ietf-languages groups, to determine how to resolve two 
different types of duplicate Description information in the IANA 
Language Subtag Registry.  The results will affect the contents of the 
future (RFC 4646bis) Registry, and some of the results may affect the 
current (RFC 4646) Registry as well.

I feel it is important to involve both the LTRU (rule-making) and 
ietf-languages (rule-applying) groups in this process, to maximize the 
chances that the same rules will be applied both now and in the future 
when similar conflicts arise.

This survey deals with exact duplicates, cases where two different 
subtags or tags have the same Description in the current or proposed 
Registry.  For each case, I provide the two (or three) conflicting 
records, along with suggested changes that would resolve the conflict in 
a way consistent with other Registry entries, and a brief discussion 
explaining the proposed change.  Please send your comments to the list, 
either supporting the suggested changes or offering alternative 
solutions.  When the rate of list responses slows sufficiently, I will 
make decisions based on list feedback and apply them in the next 
iteration of draft-4645bis.
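
For readers who want to reproduce the list of cases below, here is a rough Python sketch of an exact-duplicate check.  It is only an illustration, not the tool used to prepare this survey.  It assumes records are separated by "%%" lines as in the Registry file, and it ignores folded (continued) field bodies.  The function names parse_records and find_duplicate_descriptions are made up for this example.

```python
from collections import defaultdict


def parse_records(text):
    """Split registry-style text into records.

    Each record is a list of (field-name, value) pairs, because a field
    such as Description may appear more than once in a record.
    Assumption: records are separated by lines containing only '%%'.
    """
    records, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if line == "%%":
            if current:
                records.append(current)
            current = []
        elif ": " in line:
            name, value = line.split(": ", 1)
            current.append((name, value))
    if current:
        records.append(current)
    return records


def find_duplicate_descriptions(records):
    """Return {description: sorted subtags/tags} for shared Descriptions."""
    owners = defaultdict(set)
    for record in records:
        # A record is identified by its Subtag or Tag field.
        ident = next((v for n, v in record if n in ("Subtag", "Tag")), None)
        if ident is None:
            continue
        for name, value in record:
            if name == "Description":
                owners[value].add(ident)
    return {d: sorted(ids) for d, ids in owners.items() if len(ids) > 1}
```

Run over the whole Registry, a check like this would report each Description that belongs to more than one subtag or tag, which is the kind of conflict listed below.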

All supporting encyclopedic information on languages is from Ethnologue, 
or from Change Request Forms used to add the language or change its 
name(s) in ISO 639-3.  Please do not use this survey as an opportunity 
to criticize Ethnologue or ISO 639-3 as general sources of language 
information or classification.  Our objective is simply to find a 
sensible and consistent way to distinguish two different languages with 
the same name.  Anyone who has information from another source that 
contradicts or augments the information below is asked to share this 
information with the list, to help us make the best possible naming 
choices.

===

Type: language
Subtag: aru
Description: Aruá
--> REPLACE WITH: Description: Aruá (Arauan)
Description: Arawá
Added: 2029-09-09

Type: language
Subtag: arx
Description: Aruá
--> REPLACE WITH: Description: Aruá (Monde)
Added: 2029-09-09

Discussion:
Both of these are listed in Ethnologue as "a language of Brazil," so 
region cannot be used as a distinguishing feature.  'aru' is extinct, 
but this is also not a good attribute by which to distinguish the two, 
especially since 'arx' has only 12 speakers itself.  'aru' is listed as 
a member of the Arauan language family and 'arx' as a member of the 
Monde family, so language family seems to be the best attribute by which 
to distinguish them.

===

Type: language
Subtag: awb
Description: Awa
--> REPLACE WITH: Description: Awa (Papua New Guinea)
Added: 2029-09-09

Type: language
Subtag: vwa
Description: Awa
--> REPLACE WITH: Description: Awa (China)
Added: 2029-09-09

Discussion:
Region for 'awb' is as listed in Ethnologue.  Region for 'vwa' is as 
indicated on ISO 639-3 Change Request Form 2007-010.

===

Type: language
Subtag: bwo
Description: Boro
Description: Borna
--> REPLACE WITH: Description: Borna (Ethiopia)
Added: 2029-09-09

Type: language
Subtag: bxx
Description: Borna
--> REPLACE WITH: Description: Borna (Democratic Republic of Congo)
Added: 2029-09-09

Discussion:
Regions are as listed in Ethnologue.  The form "Democratic Republic of 
Congo" (without "the") is consistent with that used with other languages 
in ISO 639-3.  The 639-3 code element 'bwo' is currently the subject of 
a change request, to change the name "Boro" to "Boro (Ethiopia)" in 
anticipation of another Boro being added, thereby confirming that we are 
headed down the right path with this renaming effort.

===

Type: language
Subtag: diq
Description: Dimli
--> REPLACE WITH: Description: Dimli (individual language)
Added: 2029-09-09
Macrolanguage: zza

Type: language
Subtag: kiu
Description: Kirmanjki
--> REPLACE WITH: Description: Kirmanjki (individual language)
Added: 2029-09-09
Macrolanguage: zza

Type: language
Subtag: zza
Description: Zaza
Description: Dimili
Description: Dimli
--> REPLACE WITH: Description: Dimli (macrolanguage)
Description: Kirdki
Description: Kirmanjki
--> REPLACE WITH: Description: Kirmanjki (macrolanguage)
Description: Zazaki
Added: 2006-08-24
Scope: macrolanguage

Discussion:
These three cases are handled together due to their commonality.  Both 
Dimli and Kirmanjki are individual languages encompassed within Zaza, 
which may also be called Dimli or Kirmanjki, neatly exemplifying the 
concept of a macrolanguage.  The strings "individual language" and 
"macrolanguage" are used extensively in 639-3 for this purpose; see, for 
example, Dogri.
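
As a rough illustration of the rule applied here, the following Python sketch appends "(macrolanguage)" or "(individual language)" to a shared name depending on the record's Scope field.  It assumes each record is a plain dict whose "Description" value is a list of strings; that layout and the function name qualify_by_scope are invented for this example.

```python
def qualify_by_scope(records, name):
    """Return {subtag: qualified description} for records carrying `name`.

    Assumption: a record with Scope 'macrolanguage' gets the qualifier
    '(macrolanguage)'; any other record gets '(individual language)'.
    """
    result = {}
    for record in records:
        if name in record["Description"]:
            if record.get("Scope") == "macrolanguage":
                label = "macrolanguage"
            else:
                label = "individual language"
            result[record["Subtag"]] = f"{name} ({label})"
    return result


# The three records discussed above, reduced to the relevant fields.
zaza_records = [
    {"Subtag": "diq", "Description": ["Dimli"]},
    {"Subtag": "kiu", "Description": ["Kirmanjki"]},
    {"Subtag": "zza", "Scope": "macrolanguage",
     "Description": ["Zaza", "Dimili", "Dimli", "Kirdki",
                     "Kirmanjki", "Zazaki"]},
]

print(qualify_by_scope(zaza_records, "Dimli"))
print(qualify_by_scope(zaza_records, "Kirmanjki"))
```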

===

Type: language
Subtag: he
Description: Hebrew
--> NO CHANGE
Added: 2005-10-16
Suppress-Script: Hebr

Type: language
Subtag: iw
Description: Hebrew
--> NO CHANGE
Added: 2005-10-16
Deprecated: 1989-01-01
Preferred-Value: he
Suppress-Script: Hebr

Discussion:
No change is proposed to this pair of Description fields since 'iw' is 
deprecated with a Preferred-Value of 'he'.
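
The reasoning used for this pair can be expressed as a small check.  The Python sketch below is an illustration only.  It assumes each record is a plain dict of field names to values, and the function name duplicate_is_harmless is made up for this example.  It covers only the case where exactly one record in the group is still current.

```python
def duplicate_is_harmless(records):
    """Return True if a shared Description needs no change.

    Assumed rule: exactly one record is not deprecated, and every
    deprecated record has a Preferred-Value naming that record.
    """
    current = [r for r in records if "Deprecated" not in r]
    if len(current) != 1:
        return False
    keeper = current[0]["Subtag"]
    return all(
        r.get("Preferred-Value") == keeper
        for r in records
        if "Deprecated" in r
    )


hebrew = [
    {"Subtag": "he", "Description": "Hebrew"},
    {"Subtag": "iw", "Description": "Hebrew",
     "Deprecated": "1989-01-01", "Preferred-Value": "he"},
]
awa = [
    {"Subtag": "awb", "Description": "Awa"},
    {"Subtag": "vwa", "Description": "Awa"},
]

print(duplicate_is_harmless(hebrew))
print(duplicate_is_harmless(awa))
```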

===

Type: language
Subtag: id
Description: Indonesian
--> NO CHANGE
Added: 2005-10-16
Suppress-Script: Latn

Type: language
Subtag: in
Description: Indonesian
--> NO CHANGE
Added: 2005-10-16
Deprecated: 1989-01-01
Preferred-Value: id
Suppress-Script: Latn

Discussion:
No change is proposed to this pair of Description fields since 'in' is 
deprecated with a Preferred-Value of 'id'.

===

Type: language
Subtag: jv
Description: Javanese
--> NO CHANGE
Added: 2005-10-16

Type: language
Subtag: jw
Description: Javanese
--> NO CHANGE
Added: 2005-10-16
Deprecated: 2001-08-13
Preferred-Value: jv
Comments: published by error in Table 1 of ISO 639:1988

Discussion:
No change is proposed to this pair of Description fields since 'jw' is 
deprecated with a Preferred-Value of 'jv'.

===

Type: language
Subtag: mtf
Description: Murik
--> REPLACE WITH: Description: Murik (Papua New Guinea)
Added: 2029-09-09

Type: language
Subtag: mxr
Description: Murik
--> REPLACE WITH: Description: Murik (Malaysia)
Added: 2029-09-09

Discussion:
Regions are as listed in Ethnologue.

===

Type: language
Subtag: ji
Description: Yiddish
--> NO CHANGE
Added: 2005-10-16
Deprecated: 1989-01-01
Preferred-Value: yi

Type: language
Subtag: yi
Description: Yiddish
--> NO CHANGE
Added: 2005-10-16
Suppress-Script: Hebr
Scope: macrolanguage

Discussion:
No change is proposed to this pair of Description fields since 'ji' is 
deprecated with a Preferred-Value of 'yi'.

===

Type: region
Subtag: AA
Description: Private use
--> NO CHANGE
Added: 2005-10-16

Type: region
Subtag: QM..QZ
Description: Private use
--> NO CHANGE
Added: 2005-10-16

Type: region
Subtag: XA..XZ
Description: Private use
--> NO CHANGE
Added: 2005-10-16

Type: region
Subtag: ZZ
Description: Private use
--> NO CHANGE
Added: 2005-10-16

Discussion:
No change is proposed to these four Description fields since they have 
the same semantics and since private-use subtags listed in the Registry 
must be handled specially anyway.  The only way to disambiguate these 
values would be to use something like "Private use-AA", which would be 
completely arbitrary and would not solve the special-handling problem. 
The question of whether to unpack the "range" records into individual 
records (currently being discussed on LTRU) is orthogonal to the present 
survey.

===

Type: grandfathered
Tag: i-hak
Description: Hakka
--> NO CHANGE
Added: 1999-01-31
Deprecated: 2000-01-10
Preferred-Value: hak

Type: grandfathered
Tag: zh-hakka
Description: Hakka
--> NO CHANGE
Added: 1999-12-18
Deprecated: 2029-09-09
Preferred-Value: hak

Discussion:
No change is proposed to this pair of Description fields since both are 
deprecated with a Preferred-Value equal to the language subtag 'hak', 
which has a Description of "Hakka Chinese".


--
Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages


