The limit of language codes

Fri Feb 16 21:34:12 CET 2007

Speaking as the restrictive grumbler here:

A HUGE danger with language tags is the temptation, once language tags are 
successful, to cram ever-more information into them - because it's easier 
to extend the information carried in one container than to add another 
container.

I believe the POSIX "locale" was a failure for exactly that reason - it 
tried to jam together language, date format, currency codes and many other 
things in a single entity - it tried to serve a multiplicity of purposes, 
and fulfilled all of them badly.

The current language tag has dragged in information about geographical 
areas and script codes - these additions have had lots of arguments in 
their favour, but their inclusion has made the language tag a MORE 
difficult tool to use for identifying language.

If you want to tag a document as "written by Shakespeare, in 
Stratford-upon-Avon, around 1611", there's absolutely no substitute for 
saying "author = Shakespeare, year = 1611, place = Stratford-upon-Avon". 
Defining a language tag of "en-GB-1611-Shakespeare-Stratford" is a useless 
exercise that is actively harmful to the proper development of tagging 
systems.
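
To make the contrast concrete, here is a minimal sketch in Python. The 
DocumentMetadata class, its field names and the primary_language helper 
are all made up for illustration; they come from no standard or library. 
It only shows what a consumer has to do in each case.

```python
from dataclasses import dataclass


@dataclass
class DocumentMetadata:
    # One named field per fact. Field names are hypothetical.
    language: str
    author: str
    year: int
    place: str


# Separate containers: each value is found by name, no parsing needed.
doc = DocumentMetadata(
    language="en",
    author="Shakespeare",
    year=1611,
    place="Stratford-upon-Avon",
)


def primary_language(tag: str) -> str:
    """Return the first hyphen-separated subtag, lowercased (illustrative helper)."""
    return tag.split("-")[0].lower()


# One overloaded container: the consumer gets a list of unlabeled pieces
# and has to guess from position which one is the year, author or place.
overloaded = "en-GB-1611-Shakespeare-Stratford"
parts = overloaded.split("-")
# parts == ['en', 'GB', '1611', 'Shakespeare', 'Stratford']
```

With the separate fields, doc.place can hold "Stratford-upon-Avon" as it 
stands. Inside the tag the hyphen is already the separator, so the place 
name would have to be cut down or it would split into three more pieces.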

Forgetaboutit.

The growling bear now returns to his cave.

              Harald