[Ltru] Re: "mis" update review request
petercon at microsoft.com
Sat Apr 14 01:03:20 CEST 2007
It's my impression that, in MARC, the categories were considered to be a partition. At this point in ISO 639, I think the categories cannot be treated as a partition: the additions in 639-3 basically undermine that assumption. However, it might still be possible to treat the *collections* as a partition.
None of that changes the intention of mis in MARC, which I'd have to say is the same intention in ISO 639: mis is the "else" case - i.e. none of the other coded categories fits. The problem with mis is one of maintainability: any time a new entry is added in ISO 639, some documents tagged with mis may suddenly become inappropriately tagged.
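The maintainability problem can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table contents, the name "LanguageX", and its code "qaa" are hypothetical stand-ins, not real ISO 639 data, and `best_tag` is an invented helper.

```python
# Hypothetical miniature code table; NOT the real ISO 639 data.
def best_tag(language, coded):
    """Return the language's own code if it has one, else the 'else' case."""
    return coded.get(language, "mis")

coded_before = {"English": "en", "German": "de"}
# A later edition of the table adds an entry for a previously uncoded language.
coded_after = dict(coded_before, LanguageX="qaa")

tag_then = best_tag("LanguageX", coded_before)  # tagged when no code existed
tag_now = best_tag("LanguageX", coded_after)    # what the tag would be today

# Under the "else" reading, the old document is now inappropriately tagged.
stale = tag_then != tag_now
print(tag_then, tag_now, stale)
```

Nothing about the document changed between the two calls; only the table did, which is exactly why tags chosen as the "else" case are hard to maintain.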
From: Mark Davis [mailto:mark.davis at icu-project.org]
Sent: Friday, April 13, 2007 11:50 AM
To: John Cowan
Cc: Frank Ellermann; LTRU Working Group; ietf-languages at alvestrand.no
Subject: [Ltru] Re: "mis" update review request
You are right about number one; I concede that point fully -- I'd overlooked that phrasing, sorry. That means that in ISO 639-2 the following are collections:
mul Multiple languages
art Artificial (Other)
plus more normal cases:
afa Afro-Asiatic (Other)
alg Algonquian languages
and the following are not collections:
zxx No linguistic content
I disagree with your point number two. The only reason that I can conclude that "Miscellaneous Expenses" in a spreadsheet excludes the other listed categories is that I know that everything listed is intended to be a partition. The language codes clearly do not form a partition, since some collections encompass other codes. The collection codes that are tagged with (Other) are clearly meant to be the remainder of partitions, but for the collections that are not tagged with (Other) there is no evidence that they were intended to exclude other cases. If anything, the contrary: had they been meant to be the remainder of partitions, they would have said (Other).
Moreover, while some may be perfectly willing to have stability go by the wayside, it is extremely important to us, and that is one of the guiding principles of and reasons for BCP 47. That means that if I validly and correctly tag content with "mis", that application cannot be made incorrect by any future change to BCP 47. That is why we can broaden the application of codes, but cannot narrow them. There is no evidence in BCP 47 that I cannot correctly tag the content "kind" with "mis". Now, of course, we all know that we should tag with as much information as we can, so I *should* tag that with "en" if I know I mean the English word, and "de" if I mean the German, or possibly others. If I don't know which one it is, and my protocol allows multiple tags, I can use "en, de"; if not, I am forced into a choice between "mul" and choosing the 'most likely' language of the set.
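The "broaden but not narrow" rule can be pictured as a subset check. The following is a rough sketch, assuming a code's meaning is modeled as the set of contents it may validly tag; the scope contents and the `is_stable_change` helper are hypothetical and appear nowhere in BCP 47.

```python
def is_stable_change(old_scope, new_scope):
    """A change is non-breaking if everything validly tagged before stays valid."""
    return old_scope <= new_scope  # subset test on sets

# Hypothetical scope for "mis" under the reading argued for above.
mis_scope = {"kind (English word)", "kind (German word)", "uncoded-language text"}

broadened = mis_scope | {"some newly covered content"}
narrowed = mis_scope - {"kind (English word)"}

print(is_stable_change(mis_scope, broadened))  # broadening keeps old tags valid
print(is_stable_change(mis_scope, narrowed))   # narrowing invalidates an old tag
```

Redefining "mis" to exclude anything covered by another code is a narrowing in this model, which is why it counts as a breaking change under the stability principle.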
On 4/13/07, John Cowan <cowan at ccil.org> wrote:
Mark Davis scripsit:
> Saying that mis is a collection is not breaking, but also not
> substantiated by ISO 639-2.
It's just unfathomable to me how you can get that reading of the standard.
which says (section 4.1.1, second sentence):
The words *languages* or *(other)* as part of a language name
in the following tables may be taken to indicate that a language
code is a collective language code.
which says (s.v. "mis")
Note the word *languages*. Game, set, and match (or Q, E, and D).
> Saying that "which don't belong to any other collection" is a breaking
> change, *and* is not substantiated by ISO 639-2 at the time the code was
> added to the registry (or even now).
Come, come. Do you expect us, the members of this list, to suppose
that when a spreadsheet contains the line "Miscellaneous expenses"
you will find there charges for capital construction or salaries?
You will not. You will find expenses *that do not fit into any
other category* on the spreadsheet.
One art / There is John Cowan <cowan at ccil.org>
No less / No more http://www.ccil.org/~cowan
All things / To do
With sparks / Galore -- Douglas Hofstadter