Mark Davis ☕️ mark at
Fri Jan 6 18:08:28 CET 2017


On Fri, Jan 6, 2017 at 1:54 PM, Michael Everson <everson at>

> I think there is no requirement for an extensible mechanism to “admix” any
> two random languages.

The value of a generative mechanism is that one doesn't have to anticipate
ahead of time what users will find useful. Take BCP47 itself. If it turns out
that someone finds en-JP useful, then it can be represented without
waiting for a registration process, which may not succeed just because a
registrar finds "en-JP" not meaningful or useful.

The key for such a mechanism to work is that the semantics can be derived
from the components (plus syntax).
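To make "derived from the components" concrete, here is a minimal sketch that splits a tag into its base language, the -t- source language, and the h0/h1 fields, and then builds a gloss from those parts alone. The function names are hypothetical, and the two-letter field values follow the notation used in this thread rather than any registered form.

```python
import re

# A -t- field key: one letter followed by one digit (h0, h1, s0, d0, ...)
TKEY = re.compile(r"[a-z][0-9]")

def parse_hybrid_tag(tag):
    """Split a tag such as 'es-t-hi-h0-en-h1-en' into its parts."""
    subtags = tag.lower().split("-")
    if "t" not in subtags:
        return {"language": "-".join(subtags), "source": None, "fields": {}}
    t = subtags.index("t")
    base, rest = subtags[:t], subtags[t + 1:]
    source, fields, key = [], {}, None
    for sub in rest:
        if len(sub) == 1:          # another singleton ends the -t- extension
            break
        if TKEY.fullmatch(sub):    # start of a new field
            key = sub
            fields[key] = []
        elif key is None:          # before any field key: the source language
            source.append(sub)
        else:                      # value subtag of the current field
            fields[key].append(sub)
    return {
        "language": "-".join(base),
        "source": "-".join(source) or None,
        "fields": {k: "-".join(v) for k, v in fields.items()},
    }

def describe(tag):
    """Build a gloss purely from the parsed components."""
    p = parse_hybrid_tag(tag)
    text = p["language"]
    if "h0" in p["fields"]:
        text += " with an admixture of " + p["fields"]["h0"]
    if p["source"]:
        src = p["source"]
        if "h1" in p["fields"]:
            src += " with an admixture of " + p["fields"]["h1"]
        text += ", translated from " + src
    return text

print(describe("es-t-h0-en"))
print(describe("es-t-hi-h0-en-h1-en"))
```

No per-tag registration is consulted anywhere in the sketch; the gloss comes only from the subtags and their positions.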

> Moreover, while es-spanglis is user-friendly (for the bibliographers and
> librarians who are likely the ONLY users of such tags), this es-t-h0-en is
> just a load of mysterious letters.

BCP47 is not intended to be for end users; readability is nice where possible,
but not a requirement. The syntax (subtags limited to 8 ASCII-only
alphanumerics) puts restrictions on readability, but BCP47 is meant for
internal codes. Any good interface will supply a human-readable name: end
users ought to see a human-readable name in their language. Pick a random
point in the language subtag registry to get something like 'cmm'. What
percentage of bibliographers know right away that that means "Michigamea"?
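A rough sketch of that split between internal code and displayed name: the tag is checked only for the subtag-length limit mentioned above, and the interface looks up a name for it. The name table here is a hypothetical, hand-written stand-in for whatever localized data a real product would use.

```python
import re

# Each subtag: 1 to 8 ASCII letters or digits (the length limit mentioned above)
SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")

# Hypothetical name table, for illustration only
DISPLAY_NAMES = {
    "cmm": "Michigamea",
    "es-t-h0-en": "Spanglish",
}

def is_well_formed(tag):
    """Check only the per-subtag shape, not the full BCP47 grammar."""
    return all(SUBTAG.fullmatch(s) for s in tag.split("-"))

def display_name(tag):
    """Return the name to show a user, falling back to the raw code."""
    if not is_well_formed(tag):
        raise ValueError("not a well-formed tag: " + tag)
    return DISPLAY_NAMES.get(tag.lower(), tag)

print(display_name("cmm"))
print(display_name("es-t-h0-en"))
```

Under this check "es-spanglis" passes and "es-spanglish" does not, since "spanglish" has nine letters; the readable form is already constrained by the syntax.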

And they are NOT the ONLY users of such TAGS (as long as you are shouting),
which you apparently didn't pick up from my Jan 5 email. We are focused less
on the tagging of content than on the selection of content.

> 1) contact languages are NOT “transformations” so your -t- makes no sense

There is already discussion of this in the email.

> 2) what the heck is h-zero supposed to mean? Oh. hybrid zero. And there’s
> a hybrid one. And what’s that?
> Nobody will know.

People who read the documentation will know.

> > es-t-h0-en    Spanglish       Spanish with an admixture of English
> > en-t-h0-es    Spanglish       English with an admixture of Spanish
> > Note: the boundary between these two will be rather fuzzy, like most
> > cases in identifying. We'd recommend that es-t-h0-en be used unless English
> > clearly predominates.
>
> No, we wouldn’t. We’d recommend that the dominant language, whether en or
> es, be used, whichever it may be.

​The "unless predominates" is meant to signify the ~50% case. Unless you
have a clause like that, content that is about 50:50 doesn't have a clear

> > One could then also have
> >
> > es-t-hi-h0-en Spanglish translated from Hindi
> > A second key 'h1' is defined indicating that the source language for
> > transform is a hybrid, much as we have done with the transliteration s0
> > and d0 keys. The value of h1 is a language tag indicating that the
> > source language for -t- is a hybrid with that language, allowing
> > formulations like
> >
> > es-t-hi-h1-en Spanish translated from Hinglish
> > es-t-hi-h0-en-h1-en   Spanglish translated from Hinglish
>
> You’ve got to be kidding.
> This is clever, Mark, but it doesn’t address any actual user requirements,
> and the notation you propose is absurdly opaque.

Again, see the point about opaqueness above.

> > Hybrid locales
> Locales? There’s no Spanglish locale envisioned.

By you. You don't happen to be the only user of BCP47.

> > have intermixed content from 2 (or more) languages, often with one
> language's grammatical structure applied to words in another. See also ​
> for the use of the
> term “hybrid”.
> > More importantly, it doesn't work for a very common use case: locale
> > selection. To communicate requests for localized content and
> > internationalization services, locales are used, which are an extension of
> > language tags. When people pick a language from a menu, internally they are
> > picking a locale (en-GB, es-419, etc). If you want an application to
> > support Spanglish or Hinglish, then you have to have a locale to represent
> > that.
>
> I don’t think anybody wants to do this.

We have had concrete requests from product groups within Google for hybrid
locale identifiers, and not just one or two of them. This is not a whim.
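As an illustration of the selection side, here is a minimal sketch of truncation-based matching in the style of the RFC 4647 lookup scheme: the requested tag is shortened from the right until something supported matches. The function name, the supported lists, and the "en" default are assumptions made for the example, not anything a particular product does.

```python
def lookup(requested, supported, default="en"):
    """Return the best supported locale for a requested tag (sketch only)."""
    by_key = {s.lower(): s for s in supported}
    subtags = requested.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in by_key:
            return by_key[candidate]
        subtags.pop()
        # Do not leave a dangling singleton such as 't' at the end
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return default

# An application that has Spanglish content serves it ...
print(lookup("es-t-h0-en", ["en-GB", "es", "es-t-h0-en"]))
# ... and one that does not falls back to plain Spanish.
print(lookup("es-t-h0-en", ["en-GB", "es"]))
```

With an extension-based identifier, an application that has never heard of the hybrid still lands on the base language through ordinary truncation.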

Note that this does not prevent 'spanglis' from being registered. If the
use of an extension is too opaque for you, go for it. It just doesn't meet
our needs.

> > Luckily, this falls within the scope of the T extension.
> Not usefully.
> Michael