ISO 639-5 reconfirmation ballot (long)
petercon at microsoft.com
Mon Jul 18 22:26:46 CEST 2016
The lack of documentation about relationships between "collections" and individual languages is a problem that Gary Simons and I called out about 15 years ago. The problem was there in 639-2 before 639-5 was even conceived. And looking at the MARC Language Code List, which was the key source for 639-2, doesn't help a whole lot. From a dialectology perspective, it's all a bit of a mess; but if the librarians find it useful, then it's a useful mess. (Note that the librarians did not ask for and, AFAIK, do not use the additional collections added in 639-5.)
So, IMO, it is what it is: just leave it be, and don't place any great (or even modest) expectations on it. If someone decides we really need a more comprehensive coding system for collections, then perhaps an extension using a new system not constrained to alpha-3 IDs in the 639 ID space would be the best approach.
From: Ietf-languages [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Doug Ewell
Sent: Saturday, July 16, 2016 8:20 PM
To: ietf-languages <ietf-languages at iana.org>; Anthony Aristar <anthony at aristar.org>
Subject: Re: ISO 639-5 reconfirmation ballot (long)
To the extent that Anthony is arguing that ISO 639-5 language collections can't be correlated with individual languages, I certainly agree that that is a problem.
To pick one example, ISO 639-5 provides the following hierarchy for [cmc], "Chamic languages":
map : poz : pqw : cmc
This denotes the following relationship:
[map] Austronesian languages
  +-- [poz] Malayo-Polynesian languages
        +-- [pqw] Western Malayo-Polynesian languages
              +-- [cmc] Chamic languages
But there's no way to look up what individual languages are contained within [cmc]. For that matter, we can't tell except by exhaustive scanning whether [cmc] contains other, lower-level collections.
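To make the asymmetry concrete, here is a minimal Python sketch. It assumes the hierarchy strings are held in a plain dictionary; the HIERARCHIES table and the function names are hypothetical, and only the [cmc] row comes from the example above (the other rows are assumed to be its prefixes). Going upward is a simple string split, but finding lower-level collections means scanning every entry, and nothing in the data leads to individual languages at all.

```python
# Hypothetical sample table: collection code -> hierarchy string.
# Only the "cmc" row is quoted from the message; the other rows are
# assumed to be the prefixes of that path.
HIERARCHIES = {
    "map": "map",
    "poz": "map : poz",
    "pqw": "map : poz : pqw",
    "cmc": "map : poz : pqw : cmc",
}


def parse_path(hierarchy):
    """Split a hierarchy string such as 'map : poz : pqw : cmc' into codes."""
    return [code.strip() for code in hierarchy.split(":")]


def ancestors(code):
    """Ancestors come straight from the code's own hierarchy string."""
    return parse_path(HIERARCHIES[code])[:-1]


def child_collections(code):
    """Direct child collections require an exhaustive scan of the table."""
    children = []
    for other, hierarchy in HIERARCHIES.items():
        path = parse_path(hierarchy)
        # 'other' is a direct child if 'code' is the second-to-last element.
        if len(path) >= 2 and path[-2] == code:
            children.append(other)
    return sorted(children)


print(ancestors("cmc"))
print(child_collections("pqw"))
print(child_collections("cmc"))
```

With this sample table, the scan finds [cmc] under [pqw] and nothing under [cmc]; there is no comparable lookup that would return the individual languages belonging to [cmc].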
I don't know if this can realistically be solved; see my earlier comment about Ethnologue attempting to keep track of their own hierarchy, and changing the relationships with some frequency. Still, I can see that it limits the usefulness of the collections. Going back to my example of tagging something as "Hmong-Mien languages," that might not help if there is no common agreement on the members of the set of Hmong-Mien languages.
I'm not quite as sympathetic to the argument that it is a problem that collection codes cannot be easily distinguished at sight from individual language codes. I'm sure I'm missing something obvious here.
Doug Ewell | Thornton, CO, US | ewellic.org