ISO 639-5 reconfirmation ballot (long)

Anthony Aristar anthony at aristar.org
Sat Jul 16 22:42:50 CEST 2016


As everyone here has said, in one way or another, language 
classification is fraught with difficulty... even more so than the 
definition of a language, which has caused so much trouble within ISO 
639-3.  I seriously doubt we will ever have a set of universally 
accepted families and subgroups, in fact.

This does not mean that a universally accepted set of codes is 
impossible to achieve, for this is an entirely different matter. It 
simply entails an acceptance that some of those codes will refer to 
subgrouping hypotheses that few people, if any, accept any more.

Subgroups represent hypotheses about relationships, not God's Truth.  As 
such, they are always correct, within their little universe, even if no 
one has accepted them since 1790.  You don't withdraw such codes.  You 
may at worst deprecate them.

Once you start building trees you realize very quickly that 
code-elements for groupings can't simply be assigned individually.  They 
have to be assigned to a tree, or they make no sense.  This is the 
biggest problem with ISO 639-5.   Someone clearly just grabbed a number 
of families and subgroups or whatever they thought would be useful, 
stuck them into a list, and then added a hierarchy so minimal and so 
vague that you aren't really sure what is going on.  They are trying, as 
much as possible, to treat them exactly as we do individual languages. 
As a result, they make no coherent sense in and of themselves. Add to 
this the fact that there are so few of them, and they are of very 
minimal use.  An archivist, for example, who is collecting data on South 
American languages finds them almost completely useless, and will use 
something else.  And this is an actual example, by the way... not an 
invented one, for I know an archive that does exactly this.
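
A minimal sketch of what "assigned to a tree" could mean in practice, 
assuming a simple parent-pointer model. The Grouping class, the 
deprecated flag and the lineage() helper are hypothetical names made up 
for illustration; they are not part of ISO 639-5 or any registry. The 
three codes are existing 639-5 code elements, used only as sample data.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Grouping:
    code: str                      # alpha-3 code element
    name: str
    parent: Optional[str] = None   # code of the enclosing grouping, if any
    deprecated: bool = False       # hypothesis out of favor, code still kept


def lineage(tree: Dict[str, Grouping], code: str) -> List[str]:
    """Return the codes from the root grouping down to `code`."""
    path: List[str] = []
    current: Optional[str] = code
    while current is not None:
        node = tree[current]
        path.append(node.code)
        current = node.parent
    return list(reversed(path))


# Sample data only: a tiny slice of one possible tree.
TREE = {
    g.code: g
    for g in [
        Grouping("ine", "Indo-European languages"),
        Grouping("gem", "Germanic languages", parent="ine"),
        Grouping("gmw", "West Germanic languages", parent="gem"),
    ]
}

print(lineage(TREE, "gmw"))  # ['ine', 'gem', 'gmw']
```

In a model like this a code only means something through its position 
in the tree, and an abandoned hypothesis is flagged as deprecated rather 
than removed.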

Finally... even though a machine would have no trouble disentangling a 
language code from a subgrouping code, human beings do.  And they are 
reluctant to use something that confuses them.
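
As a rough illustration of why this is trivial for a machine: given the 
code tables, telling the two kinds of code apart is a lookup. The sets 
below are tiny hand-picked samples and classify() is a hypothetical 
helper; a real tool would load the complete 639-3 and 639-5 tables.

```python
# Illustrative subsets only, not the full code tables.
INDIVIDUAL_LANGUAGES = {"eng", "deu", "fra"}   # ISO 639-3 samples
COLLECTIONS = {"ine", "gem", "hmx"}            # ISO 639-5 samples


def classify(code: str) -> str:
    """Say whether an alpha-3 code names a language or a grouping."""
    code = code.lower()
    if code in INDIVIDUAL_LANGUAGES:
        return "language"
    if code in COLLECTIONS:
        return "collection"
    return "unknown"


print(classify("eng"))  # language
print(classify("hmx"))  # collection
```

A person looking at "eng" and "hmx" side by side gets no such hint from 
the codes themselves, since both share one alpha-3 code space.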

In sum?  By all means confirm ISO 639-5.  I suppose it's doing no great 
harm.  But don't get the idea that many people -- and especially 
linguists -- will ever use it much.  You can't just expand it, for the premise is 
wrong.  You need something better.

On 7/16/2016 2:58 PM, Doug Ewell wrote:
> Apologies for length.
>
> John Cowan wrote:
>
>> If [ISO 639-5] were withdrawn, our Registry would remain unchanged.
>> Our only obligation with respect to ISO 639-5 is to add any subtags
>> that the RA (the Library of Congress) should decide to add.  If the RA
>> ceases to exist, we don't have to do anything.  Certainly we wouldn't
>> remove any of the codes from the former standard.
>
> While BCP 47 says what to do when a core-standard code element is 
> withdrawn -- deprecate the corresponding subtag -- I don't believe it 
> has any provision for an entire standard to be withdrawn or its RA or 
> MA to disband. I would assume the same rule would have to apply: the 
> subtags would be deprecated, which as always means "discouraged but 
> still valid," and possibly even "preferred in certain contexts" 
> (§3.1.7). I doubt any of this would cause the earth to stop spinning.
>
> I will say that, from the standpoint not of pure linguistics but of 
> users of language tags, whose need is to identify and search for 
> content, the idea of withdrawing ISO 639-5 seems excessive.
>
> Language classification is always fraught with disagreement, macro and 
> micro. There are numerous ways to classify languages, and different 
> approaches meet the needs of different constituencies. Linguists don't 
> always need what historians need. It's unlikely that "Eastern 
> Hemisphere languages" and "Western Hemisphere languages" would be of 
> use to anyone, but none of the existing schemes serves everyone's 
> needs either.
>
> Language classifications are imprecise because languages and our 
> understanding of them are imprecise. John wrote:
>
>> "Language" is a concept with a pretty strong basis in fact, though
>> there are edge cases and politicized questions.  "Language family" is
>> a purely theoretical construct and subject to constant change.
>
> but in fact there are frequent debates and uncertainty over both 
> languages and groupings. Every year 639-3/RA gets requests to add 
> newly identified languages, delete non-existent ones, merge two or 
> more into one, split one into two or more, and create macrolanguages 
> (whatever the requester thinks "macrolanguage" means). There is 
> uncertainty within grouping schemes as well. Ethnologue maintains a 
> tree of language families (http://www.ethnologue.com/browse/families) 
> and earlier this year they moved a dozen Austronesian languages out of 
> the "Malayo-Polynesian" subgroup and into subgroups of their own. They 
> make changes like this several times a year. Fortunately for your 
> Co-Designated Expert, BCP 47 does not try to keep up with them!
>
> As I understand it, the goal of 639-2 was to provide coding for every 
> known language, within a single code space, and with the constraint 
> that they couldn't all be enumerated and thousands would have to be 
> covered by collection code elements like "X languages" or, more 
> commonly, "Other X languages." 639-3 did try to enumerate them all, 
> but the use case for collection codes did not disappear. Sometimes one 
> knows that content is in (say) some Hmong-Mien language but not which 
> one, and tagging it as "Hmong-Mien languages" is better than not being 
> able to tag it at all. In a case like that, where identification of 
> some sort is paramount, the distinction between an individual language 
> code and a collection code might be irrelevant.
>
> This, again as I understand it, is why 639-5 exists, why its 
> repertoire was expanded to cover all languages instead of just the 
> "leftovers," why it shares a common alpha-3 code space with 639-2 and 
> -3, and why I think that, imperfect though it may be, it should be 
> reconfirmed.
>
> -- 
> Doug Ewell | Thornton, CO, US | ewellic.org
> _______________________________________________
> Ietf-languages mailing list
> Ietf-languages at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/ietf-languages


