Duplicate Busters: Survey #1
Doug Ewell
doug at ewellic.org
Thu Jul 31 06:59:37 CEST 2008
This is the first of two surveys that are being distributed to both the
LTRU and ietf-languages groups, to determine how to resolve two
different types of duplicate Description information in the IANA
Language Subtag Registry. The results will affect the contents of the
future (RFC 4646bis) Registry, and some of the results may affect the
current (RFC 4646) Registry as well.
I feel it is important to involve both the LTRU (rule-making) and
ietf-languages (rule-applying) groups in this process, to maximize the
chances that the same rules will be applied both now and in the future
when similar conflicts arise.
This survey deals with exact duplicates, cases where two different
subtags or tags have the same Description in the current or proposed
Registry. For each case, I provide the two (or three) conflicting
records, along with suggested changes that would resolve the conflict in
a way consistent with other Registry entries, and a brief discussion
explaining the proposed change. Please send your comments to the list,
either supporting the suggested changes or offering alternative
solutions. When the rate of list responses slows sufficiently, I will
make decisions based on list feedback and apply them in the next
iteration of draft-4645bis.
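As a rough illustration of what "exact duplicate" means here, the following Python sketch groups Description values by record Type and reports any value shared by more than one subtag or tag. It assumes the Registry's record-jar layout ("%%" separator lines, "Field: value" lines, continuation lines that begin with whitespace). The names SAMPLE, parse_records, and find_duplicate_descriptions are hypothetical, and grouping by Type is an assumption about how the duplicates are scoped, not a statement of how the cases below were actually found.

```python
from collections import defaultdict

# Hypothetical sample in the Registry's record layout ("%%" between records).
SAMPLE = """\
Type: language
Subtag: awb
Description: Awa
Added: 2029-09-09
%%
Type: language
Subtag: vwa
Description: Awa
Added: 2029-09-09
%%
Type: language
Subtag: zza
Description: Zaza
Description: Dimli
Added: 2006-08-24
Scope: macrolanguage
%%
Type: language
Subtag: diq
Description: Dimli
Added: 2029-09-09
Macrolanguage: zza
"""


def parse_records(text):
    """Return a list of records, each a list of (field, value) pairs."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "%%":
            # Separator line: close the record collected so far.
            if current:
                records.append(current)
            current = []
        elif line[:1].isspace() and current:
            # Continuation line: fold it into the previous field's value.
            name, value = current[-1]
            current[-1] = (name, value + " " + line.strip())
        elif ":" in line:
            name, value = line.split(":", 1)
            current.append((name.strip(), value.strip()))
    if current:
        records.append(current)
    return records


def find_duplicate_descriptions(records):
    """Map (Type, Description) to the subtags/tags that share it."""
    seen = defaultdict(list)
    for fields in records:
        rec_type = next((v for n, v in fields if n == "Type"), None)
        ident = next((v for n, v in fields if n in ("Subtag", "Tag")), None)
        for name, value in fields:
            if name == "Description":
                seen[(rec_type, value)].append(ident)
    return {key: ids for key, ids in seen.items() if len(ids) > 1}


if __name__ == "__main__":
    for (rec_type, desc), ids in find_duplicate_descriptions(
        parse_records(SAMPLE)
    ).items():
        print(rec_type, repr(desc), ids)
```

The comparison is a plain string match, so near-duplicates that differ in case, accents, or spelling are outside the scope of this sketch.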
All supporting encyclopedic information on languages is from Ethnologue,
or from Change Request Forms used to add the language or change its
name(s) in ISO 639-3. Please do not use this survey as an opportunity
to criticize Ethnologue or ISO 639-3 as general sources of language
information or classification. Our objective is simply to find a
sensible and consistent way to distinguish two different languages with
the same name. Anyone who has information from another source that
contradicts or augments the information below is asked to share this
information with the list, to help us make the best possible naming
choices.
===
Type: language
Subtag: aru
Description: Aruá
--> REPLACE WITH: Description: Aruá (Arauan)
Description: Arawá
Added: 2029-09-09
Type: language
Subtag: arx
Description: Aruá
--> REPLACE WITH: Description: Aruá (Monde)
Added: 2029-09-09
Discussion:
Both of these are listed in Ethnologue as "a language of Brazil," so
region cannot be used as a distinguishing feature. 'aru' is extinct,
but this is also not a good attribute by which to distinguish the two,
especially since 'arx' has only 12 speakers itself. 'aru' is listed as
a member of the Arauan language family and 'arx' as a member of the
Monde family, so this seems to be the best choice.
===
Type: language
Subtag: awb
Description: Awa
--> REPLACE WITH: Description: Awa (Papua New Guinea)
Added: 2029-09-09
Type: language
Subtag: vwa
Description: Awa
--> REPLACE WITH: Description: Awa (China)
Added: 2029-09-09
Discussion:
Region for 'awb' is as listed in Ethnologue. Region for 'vwa' is as
indicated on ISO 639-3 Change Request Form 2007-010.
===
Type: language
Subtag: bwo
Description: Boro
Description: Borna
--> REPLACE WITH: Description: Borna (Ethiopia)
Added: 2029-09-09
Type: language
Subtag: bxx
Description: Borna
--> REPLACE WITH: Description: Borna (Democratic Republic of Congo)
Added: 2029-09-09
Discussion:
Regions are as listed in Ethnologue. The form "Democratic Republic of
Congo" (without "the") is consistent with that used with other languages
in ISO 639-3. The 639-3 code element 'bwo' is currently the subject of
a change request, to change the name "Boro" to "Boro (Ethiopia)" in
anticipation of another Boro being added, thereby confirming that we are
headed down the right path with this renaming effort.
===
Type: language
Subtag: diq
Description: Dimli
--> REPLACE WITH: Description: Dimli (individual language)
Added: 2029-09-09
Macrolanguage: zza
Type: language
Subtag: kiu
Description: Kirmanjki
--> REPLACE WITH: Description: Kirmanjki (individual language)
Added: 2029-09-09
Macrolanguage: zza
Type: language
Subtag: zza
Description: Zaza
Description: Dimili
Description: Dimli
--> REPLACE WITH: Description: Dimli (macrolanguage)
Description: Kirdki
Description: Kirmanjki
--> REPLACE WITH: Description: Kirmanjki (macrolanguage)
Description: Zazaki
Added: 2006-08-24
Scope: macrolanguage
Discussion:
These three cases are handled together due to their commonality. Both
Dimli and Kirmanjki are individual languages encompassed within Zaza,
which may also be called Dimli or Kirmanjki, neatly exemplifying the
concept of a macrolanguage. The strings "individual language" and
"macrolanguage" are used extensively in 639-3 for this purpose; see, for
example, Dogri.
===
Type: language
Subtag: he
Description: Hebrew
--> NO CHANGE
Added: 2005-10-16
Suppress-Script: Hebr
Type: language
Subtag: iw
Description: Hebrew
--> NO CHANGE
Added: 2005-10-16
Deprecated: 1989-01-01
Preferred-Value: he
Suppress-Script: Hebr
Discussion:
No change is proposed to this pair of Description fields since 'iw' is
deprecated with a Preferred-Value of 'he'.
===
Type: language
Subtag: id
Description: Indonesian
--> NO CHANGE
Added: 2005-10-16
Suppress-Script: Latn
Type: language
Subtag: in
Description: Indonesian
--> NO CHANGE
Added: 2005-10-16
Deprecated: 1989-01-01
Preferred-Value: id
Suppress-Script: Latn
Discussion:
No change is proposed to this pair of Description fields since 'in' is
deprecated with a Preferred-Value of 'id'.
===
Type: language
Subtag: jv
Description: Javanese
--> NO CHANGE
Added: 2005-10-16
Type: language
Subtag: jw
Description: Javanese
--> NO CHANGE
Added: 2005-10-16
Deprecated: 2001-08-13
Preferred-Value: jv
Comments: published by error in Table 1 of ISO 639:1988
Discussion:
No change is proposed to this pair of Description fields since 'jw' is
deprecated with a Preferred-Value of 'jv'.
===
Type: language
Subtag: mtf
Description: Murik
--> REPLACE WITH: Description: Murik (Papua New Guinea)
Added: 2029-09-09
Type: language
Subtag: mxr
Description: Murik
--> REPLACE WITH: Description: Murik (Malaysia)
Added: 2029-09-09
Discussion:
Regions are as listed in Ethnologue.
===
Type: language
Subtag: ji
Description: Yiddish
--> NO CHANGE
Added: 2005-10-16
Deprecated: 1989-01-01
Preferred-Value: yi
Type: language
Subtag: yi
Description: Yiddish
--> NO CHANGE
Added: 2005-10-16
Suppress-Script: Hebr
Scope: macrolanguage
Discussion:
No change is proposed to this pair of Description fields since 'ji' is
deprecated with a Preferred-Value of 'yi'.
===
Type: region
Subtag: AA
Description: Private use
--> NO CHANGE
Added: 2005-10-16
Type: region
Subtag: QM..QZ
Description: Private use
--> NO CHANGE
Added: 2005-10-16
Type: region
Subtag: XA..XZ
Description: Private use
--> NO CHANGE
Added: 2005-10-16
Type: region
Subtag: ZZ
Description: Private use
--> NO CHANGE
Added: 2005-10-16
Discussion:
No change is proposed to these four Description fields since they have
the same semantics and since private-use subtags listed in the Registry
must be handled specially anyway. The only way to disambiguate these
values would be to use something like "Private use-AA", which would be
completely arbitrary and would not solve the special-handling problem.
The question of whether to unpack the "range" records into individual
records (currently being discussed on LTRU) is orthogonal to the present
survey.
===
Type: grandfathered
Tag: i-hak
Description: Hakka
--> NO CHANGE
Added: 1999-01-31
Deprecated: 2000-01-10
Preferred-Value: hak
Type: grandfathered
Tag: zh-hakka
Description: Hakka
--> NO CHANGE
Added: 1999-12-18
Deprecated: 2029-09-09
Preferred-Value: hak
Discussion:
No change is proposed to this pair of Description fields since both are
deprecated with a Preferred-Value equal to the language subtag 'hak',
which has a Description of "Hakka Chinese".
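The "no change" reasoning used for the deprecated cases in this survey (he/iw, id/in, jv/jw, ji/yi, and the two Hakka tags) can be sketched as a small Python check. This is only one reading of that reasoning, stated as an assumption: a group of records sharing a Description is left alone when at most one record is not deprecated, every deprecated record carries a Preferred-Value, all of those values agree, and they point at the non-deprecated record if there is one. The function name needs_disambiguation and the dict layout are hypothetical, and the private-use region case, which is left unchanged for a different reason, is not covered.

```python
def needs_disambiguation(group):
    """Decide whether records sharing one Description should be renamed.

    `group` is a list of dicts with an 'id' key (the subtag or tag) and
    optional 'Deprecated' and 'Preferred-Value' keys. Hypothetical layout.
    """
    active = [r for r in group if "Deprecated" not in r]
    deprecated = [r for r in group if "Deprecated" in r]

    # Two or more live records with the same Description: rename them.
    if len(active) > 1:
        return True

    # Every deprecated record must point at the same preferred value.
    targets = {r.get("Preferred-Value") for r in deprecated}
    if None in targets or len(targets) != 1:
        return True

    # If one live record remains, the deprecated ones must point at it.
    if active:
        return active[0]["id"] not in targets
    return False


hebrew = [
    {"id": "he"},
    {"id": "iw", "Deprecated": "1989-01-01", "Preferred-Value": "he"},
]
hakka = [
    {"id": "i-hak", "Deprecated": "2000-01-10", "Preferred-Value": "hak"},
    {"id": "zh-hakka", "Deprecated": "2029-09-09", "Preferred-Value": "hak"},
]
awa = [{"id": "awb"}, {"id": "vwa"}]

print(needs_disambiguation(hebrew))
print(needs_disambiguation(hakka))
print(needs_disambiguation(awa))
```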
--
Doug Ewell * Thornton, Colorado, USA * RFC 4645 * UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages