Doug Ewell doug at
Mon Dec 19 02:17:28 CET 2016

Luc Pardon wrote:

> BCP47 violates that rule big time, by packing all kinds of things
> (script, orthography, ...) in a field that was originally intended (in
> HTML) to contain only the language.

John has already addressed the historical context -- that is, the world 
did not begin with HTML.

Additionally, John alluded to the fact that there has always been a 
sense among language tag users, going back to at least the 1980s, that 
some critical language-tagging distinctions go beyond language alone. 
"Simplified Chinese" and "Traditional Chinese" have always needed 
different resource sets, different spell-checkers, different parameters 
for searching and sorting. Canadian French, Swiss French, Belgian 
French, and Hexagonal French have their differences. Twenty years ago 
our translators in Quebec and Paris waged a mighty war over the use of 
"taux" versus "niveau" to translate a "level" of laboratory control 

Of course, once we replaced the one-off registration of language-region 
and language-script pairs with a generative mechanism in BCP 47, we did 
open the door for arbitrary combinations. But this is not a simple, 
easily dismissed matter of putting unrelated items into a single field, 
the way that (as David said) putting multiple language tags into a 
single language-tag field would be.
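
To make the "generative" point concrete, here is a minimal sketch in Python of how a tag decomposes into language, script, and region subtags. It is only an illustration under simplifying assumptions: the helper name split_simple_tag and the pattern are hypothetical, and the pattern deliberately ignores the extlang, variant, extension, and private-use subtags that real BCP 47 tags may carry.

```python
import re

# Simplified shape: language, then optional script, then optional region.
# Assumption: extlang, variants, extensions and private use are out of scope.
_SIMPLE_TAG = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
)


def split_simple_tag(tag):
    """Return (language, script, region), or None if the tag is not of the simple shape."""
    m = _SIMPLE_TAG.fullmatch(tag)
    if m is None:
        return None
    script = m.group("script")
    region = m.group("region")
    # Use the conventional casing: lowercase language, Titlecase script,
    # uppercase region. Tags themselves are case-insensitive.
    return (
        m.group("language").lower(),
        script.title() if script else None,
        region.upper() if region else None,
    )


for example in ("zh-Hans", "zh-Hant-TW", "fr-CA", "es-419"):
    print(example, split_simple_tag(example))
```

Each subtag is identified by its length and position, which is what lets any language combine with any script or region without a separate registration for each pair.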

> But you can tag the document as a whole with "lang=ru" and then
> proceed to tag the notes or citations separately as "lang=fr". Problem
> solved. Your search will bring up the annotated copy along with all
> the other Russian-language copies/editions of the book.

Spanglish isn't like this, though. Neither is Franglais or Tagalish or 
Hinglish or the other combinations. One of the identifying features of 
these hybrids is that vocabulary from each language is mixed so freely 
that declaring one of the contributing languages to be the "base" and 
the others to be exceptions, like "caramba" or "oy vey" in the middle of 
an otherwise all-English text, both misses the point and requires 
ridiculous amounts of markup.

> Or would it be enough if the Queen keeps saying "Caramba, off with
> their heads" to make it Spanglish, even if that is the only Spanish
> word in the entire book? Or would that then be "Englanish"?

See above. This would be English with a single Spanish word, which 
normally wouldn't even be tagged. That's not at all what Spanglish and 
friends are.

Doug Ewell | Thornton, CO, US | 
