proposed ISO standard for language variations

Doug Ewell doug at
Tue May 10 19:44:26 CEST 2016

Don Osborn wrote:

> Per Yury's comment on 'i+1-th level' subtags, I would add that it does
> seem that the project (in the grand sense) of standardization as
> regards languages focuses more on distinctions within languages (in
> the case of the L2/16-131 proposal, "down to the language variety of
> an individual speaker"), the broad utility of which is hard to see,
> outside perhaps of description. Not to say such distinctions are not
> useful - they can be of course, but where the need arises. So I tend
> to agree with Yury's conclusion.

The use case of tagging language all the way down to the individual user
or document was actually part of a failed attempt to derail the RFC 4646
project over a decade ago. This level of detail is interesting in
certain fields of linguistic research, but not for identifying or
locating content in the BCP 47 sense. If you want to find samples of
English just as Joyce wrote it in "Ulysses," you wouldn't use a language
tag; you'd search for "Ulysses."

So I agree that portions of the work item might find use within a BCP 47
extension, but not the individual-speaker use case.

> What is missing I think is systematic attention to the '(i-1)th level'
> - or perhaps '(i-0.5)th level' - where mutual intelligibility, common
> phonetics, similar structure, and shared vocabulary may, and in many
> cases do, make linguistic boundaries fade. It is at this level
> that communication happens but mostly outside the description of the
> coding system. Yes, macrolanguages are in this space, but my
> understanding of that category is that it was forced by the need to
> accommodate certain established ISO 639-1/2 categories broader than
> what were identified for ISO 639-3, and as such was never extended to
> other logical candidates. An example of the latter is Kinyarwanda and
> Kirundi, which are close enough that I am told that speakers of one
> understand the other, and that a recent job announcement called for a
> translator of "Kinyarwanda or Kirundi" (implying a functional
> equivalence of the two for whatever their needs were).

Recognizing that such cases exist and identifying the lowest-hanging
examples are the easy part. Getting agreement on the criteria for
grouping in less-obvious cases is another matter. Consider the Swedish
government's argument that Elfdalian is a dialect of Swedish.

Ethnologue groups languages by families, in a hierarchy that can run 20
or more levels deep, but the groupings change frequently, aren't really
a standard, and would have to be finessed in any case (Kinyarwanda and
Rundi are in different 12th-level families). ISO 639-5 identifies
language families and groups, but not the individual languages that
belong to them, so that's out.

"Macrolanguage" is a 639-3 term with a specific meaning and it would be
a fine idea not to try to extend or redefine it.

Doug Ewell | Thornton, CO 🇺🇸

