Ietf-languages Digest, Vol 74, Issue 1
Anthony Aristar
aristar at linguistlist.org
Sun Feb 22 16:11:50 CET 2009
Well, it's interesting to know the background for this set. But it raises
a recurrent issue for ISO standards. More than once in the past I've seen
a standard promulgated; yet the explanation for the oddities of that
standard is known only to those who happen to be in a select group. This
means that standards generate a kind of contempt from those who are
outsiders.
The official term for 639-5 is, after all: "Codes for the Representation
of Names of Languages. Part 5: Alpha-3 code for language families and
groups". If what John says is true -- and I am quite willing to believe it
is -- then this title is not merely misleading, but erroneous. This is
not, it seems, what these codes are actually doing.
Furthermore, the idea of "vague" codes is a very useful one. There are
indeed situations where you would like to tag a data-set as just "North
American Indian". But the solution is not to produce a code-set that is so
internally confused about whether it refers to geographical regions or
linguistic ones that it is more likely to generate derision than
acceptance.
John Cowan wrote:
> [Quoted fragments have been reordered]
>
> Anthony Aristar scripsit:
>
>
>> [T]he code-set is a mish-mash that is very reminiscent of the mess
>> that ISO 639-1/2 were before ISO 639-3 came along [...].
>>
>
> Not surprising: it's the same mish-mash, just with additional codes for
> some well-known groupings.
>
> The purpose of 639-5, at least in connection with BCP 47, is to make it
> possible to tag documents whose language has not been determined exactly.
> It allows vagueness. You may not know the exact language of a document,
> but perhaps you at least know that it is written in a North American
> Indian language, so you can tag it "nai" or perhaps "nai-Latn" or
> "nai-fonipa" to add information about the transcription. That gives
> someone classifying or retrieving the document something more to go on
> than a flat "und" or other indicator of absence.
>
> For classification, it doesn't much matter if a group is genetic or not.
> Indeed, genetic groupings may be singularly unhelpful, not to mention
> unstable, in parts of the world where the relationships between languages
> are not yet firmly established. And in the BCP 47 world, we value
> stability at least slightly higher than truth.
>
>
>> [T]he use of Alpha-3 makes the codes easily confusable with ISO
>> 639-3. I know of at least one project that simply won't use them
>> because of this.
>>
>
> Whereas that is very convenient for BCP 47 purposes: the "primary
> language" subtag can be a collection, a macrolanguage, or an individual
> language without having to have variable syntax (except for the use of
> 639-1 two-letter codes, which is retained for backward compatibility).
>
>
>> [ISO 639-5] is, like the original 639-1, so small as to be relatively
>> useless. The fact that it can be expanded through the normal change
>> process is not very useful: it will take a *LONG* time to get
>> everything in that we as linguists need.
>>
>
> It's simply not meant for use by linguists.
>
>
>> [S]ome of the names used are enough to make linguists cringe.
>>
>
> True enough.
>
> A cocky novice once said to Stallman: "I can guess why the editor
> is called Emacs, but why is the justifier called Bolio?". Stallman
> replied forcefully, "Names are but names. 'Emack & Bolio's' is the
> name of a popular ice-cream shop in Boston-town. Neither of these
> men had anything to do with the software."
>
> His question answered, yet unanswered, the novice turned to go,
> but Stallman called to him, "Neither Emack nor Bolio had anything
> to do with the ice-cream shop, either."
>
> This is generally known as the ice-cream koan.
>
>
--
**************************************
Anthony Aristar, Director, Institute for Language & Information Technology
Professor of Linguistics Moderator, LINGUIST Linguistics Program
Dept. of English aristar at linguistlist.org
Eastern Michigan University 2000 Huron River Dr, Suite 104
Ypsilanti, MI 48197
U.S.A.
URL: http://linguistlist.org/aristar/
**************************************