It's my impression that, in MARC, the categories were considered to be a partition. At this point in ISO 639, I think the full set of categories cannot be treated as a partition: the additions in 639-3 basically undermine that assumption. However, it might still be possible to treat the *collections* as a partition.

None of that changes the intention of mis in MARC, which I'd have to say is the same intention in ISO 639: mis is the "else" case - i.e. none of the other coded categories fits. The problem with mis is one of maintainability: any time a new entry is added in ISO 639, some documents tagged with mis may suddenly become inappropriately tagged.
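
To make the maintainability problem concrete, here is a minimal sketch, assuming "mis" is assigned as the fallback whenever no coded entry fits. The function name and the toy code table are hypothetical and are not taken from ISO 639 or MARC; "Examplish" is an invented language, and "qaa" is used only as a placeholder from the ISO 639-2 local-use range.

```python
def tag_for(language, coded):
    """Return the code for `language`, or "mis" if no coded entry fits."""
    return coded.get(language, "mis")


# Toy code table before a new entry is added (illustrative only).
before = {"English": "eng", "German": "deu"}

# The same table after a hypothetical new entry is added.
after = dict(before)
after["Examplish"] = "qaa"

old_tag = tag_for("Examplish", before)  # tag chosen when no entry existed
new_tag = tag_for("Examplish", after)   # tag that would be chosen today

# A document tagged earlier still carries old_tag, which the "else"
# reading of "mis" no longer allows once the new entry exists.
stale = old_tag != new_tag
```

Under the "else" reading, every addition to the table can silently turn some existing "mis" tags into stale ones, which is the cost being described.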


You are right about number one; I concede that point fully -- I'd overlooked that phrasing, sorry. That means that in ISO 639-2 the following are collections:

mul             Multiple languages
art             Artificial (Other)

plus more normal cases:

afa             Afro-Asiatic (Other)
alg             Algonquian languages

and the following are not collections:

und             Undetermined
zxx             No linguistic content

I disagree with your point number two. The only reason that I can conclude that "Miscellaneous Expenses" in a spreadsheet excludes the other listed categories is that I know the listed categories are intended to form a partition. The language codes clearly do not form a partition, since some collections encompass other codes. The collection codes that are tagged with (Other) are clearly meant to be the remainder of partitions, but for the collections that are not tagged with (Other) there is no evidence that they were intended to exclude other cases. If anything, the contrary: had they been meant to be the remainder of partitions, they would have said (Other).

Moreover, while some may be perfectly willing to have stability go by the wayside, it is extremely important to us, and that is one of the guiding principles of and reasons for BCP 47. That means that if I validly and correctly tag content with "mis", that tagging cannot be made incorrect by any future change to BCP 47. That is why we can broaden the application of codes, but cannot narrow them. There is no evidence in BCP 47 that I cannot correctly tag the content "kind" with "mis". Now, of course, we all know that we should tag with as much information as we can, so I *should* tag that with "en" if I know I mean the English word, and "de" if I mean the German, or possibly others. If I don't know which one it is, and my protocol allows multiple tags, I can use "en, de"; if it does not, I am forced into a choice between "mul" and the 'most likely' language of the set.
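
The two ideas in this paragraph can be sketched roughly as follows. This is only an illustration under my own assumptions: the function names, the set-based model of what a code covers, and the `prefer_most_likely` switch are hypothetical and do not come from BCP 47.

```python
def is_stable_change(old_scope, new_scope):
    """True if everything the code validly covered before is still covered.

    Broadening keeps old tags valid; narrowing can invalidate them.
    """
    return set(old_scope) <= set(new_scope)


def choose_tags(candidates, allow_multiple, prefer_most_likely=False):
    """Pick tags for content whose language may be ambiguous.

    `candidates` is ordered from most to least likely.
    """
    if not candidates:
        raise ValueError("at least one candidate language is required")
    if len(candidates) == 1:
        return [candidates[0]]       # language is known: be specific
    if allow_multiple:
        return list(candidates)      # protocol allows "en, de"
    if prefer_most_likely:
        return [candidates[0]]       # single tag: best guess
    return ["mul"]                   # single tag: multiple languages


# The word "kind" could be English or German.
both = choose_tags(["en", "de"], allow_multiple=True)
single = choose_tags(["en", "de"], allow_multiple=False)
```

The subset test is just one way to state "broaden, never narrow": a change passes only if no previously valid use of the code falls outside its new scope.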

Mark Davis scripsit:

> Saying that mis is a collection is not breaking, but also not
> substantiated by ISO 639-2.

It's just unfathomable to me how you can get that reading of the standard.

> http://www.loc.gov/standards/iso639-2/normtext.html

which says (section 4.1.1, second sentence):

        The words *languages* or *(other)* as part of a language name
        in the following tables may be taken to indicate that a language
        code is a collective language code.

> http://www.loc.gov/standards/iso639-2/php/code_list.php

which says (s.v. "mis")

        Miscellaneous languages

Note the word *languages*.  Game, set, and match (or Q, E, and D).
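
For what it is worth, the naming rule quoted from section 4.1.1 can be applied mechanically to the names mentioned in this thread. The sketch below is only that: the helper name and the regular expression are my own, and the standard says the words "may be taken to indicate" a collective code, so this is a heuristic rather than a definition.

```python
import re


def is_collective(name):
    """Apply the quoted naming heuristic to an English language name."""
    pattern = r"\blanguages\b|\(other\)"
    return re.search(pattern, name, re.IGNORECASE) is not None


# Names as cited in this thread.
names = {
    "mul": "Multiple languages",
    "art": "Artificial (Other)",
    "afa": "Afro-Asiatic (Other)",
    "alg": "Algonquian languages",
    "und": "Undetermined",
    "zxx": "No linguistic content",
    "mis": "Miscellaneous languages",
}

collective = sorted(code for code, name in names.items() if is_collective(name))
```

By this test "mis" lands with the collective codes, while "und" and "zxx" do not.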

> Saying that "which don't belong to any other collection" is a breaking
> change, *and* is not substantiated by ISO 639-2 at the time the code was
> added to the registry (or even now).

Come, come.  Do you expect us, the members of this list, to suppose
that when a spreadsheet contains the line "Miscellaneous expenses"
you will find there charges for capital construction or salaries?
You will not.  You will find expenses *that do not fit into any
other category* on the spreadsheet.

