Preferred Values for Irregular Tags

Wed Jan 20 19:47:09 CET 2010

On 20 Jan 2010, at 18:36, Mark Davis ☕ wrote:

> The grandfathered tags behave differently than anything else. All  
> the other tags are productive: you can combine them in different  
> ways with expected results, while the grandfathered tags are atomic;  
> you can't combine one of them with, say, a region. Moreover, you can  
> write APIs to deal with that structure, returning the base language  
> code, script code, etc. The uniformity of program APIs is of extreme  
> importance when you are dealing with massive amounts of program code.
>
> Of course we could parse en-GB-oed. But it doesn't fit into the  
> regular ABNF production rules, and so doesn't work well in APIs.
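
To make the structural point above concrete, here is a minimal sketch in Python of an accessor-style parser. It assumes a simplified grammar (language, optional script, optional region, variants) and leaves out extlang, extensions, and private use; the name parse_tag is hypothetical. The irregular list is the one given in RFC 5646. A regular tag decomposes into fields, while an irregular tag such as en-GB-oed can only be recognised whole, because "oed" is too short to be a variant subtag.

```python
import re

# Irregular grandfathered tags as listed in RFC 5646, compared case-insensitively.
IRREGULAR = {
    "en-gb-oed", "i-ami", "i-bnn", "i-default", "i-enochian", "i-hak",
    "i-klingon", "i-lux", "i-mingo", "i-navajo", "i-pwn", "i-tao",
    "i-tay", "i-tsu", "sgn-be-fr", "sgn-be-nl", "sgn-ch-de",
}

# Simplified langtag: language[-script][-region][-variant...].
# Assumption: extlang, extensions, and private use are omitted for brevity.
LANGTAG = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:-(?P<script>[a-z]{4}))?"
    r"(?:-(?P<region>[a-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)$",
    re.IGNORECASE,
)


def parse_tag(tag):
    """Split a regular tag into fields; return an irregular tag as one atom."""
    if tag.lower() in IRREGULAR:
        # No base language, script, or region can be read off an atomic tag.
        return {"irregular": tag}
    match = LANGTAG.match(tag)
    if match is None:
        raise ValueError(f"not a tag this simplified grammar accepts: {tag!r}")
    variants = [v for v in match.group("variants").split("-") if v]
    return {
        "language": match.group("language"),
        "script": match.group("script"),
        "region": match.group("region"),
        "variants": variants,
    }
```

Calling parse_tag("sl-Latn-IT-nedis") yields separate language, script, region, and variant fields, whereas parse_tag("en-GB-oed") only succeeds through the special-case lookup.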

So you can do it.

> Out of the billions of possible language tags (without even counting  
> combinations using variants), there are literally only a handful of  
> grandfathered codes (that cannot be correctly mapped to regular  
> language tags). If we can fix these few, then there is nothing  
> standing in the way of everyone being able to use all of them  
> effectively.
>
> That is, for existing data, we (and others like us) would convert  
> tags like en-GB-oed on input to regular tags; then the information  
> is still accessible. Otherwise our only choices are to dump the data  
> or map to the 'closest' code.
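
The conversion on input described above could be sketched as a plain lookup, again in Python. The function name normalize_on_input and the PREFERRED table are hypothetical, and the replacement value shown for en-GB-oed is an assumption for illustration only, not a value given in this thread.

```python
def normalize_on_input(tag, preferred):
    """Return the configured replacement for `tag`, or `tag` itself if none exists."""
    # Language tags are case-insensitive, so compare on lowercased keys.
    lookup = {key.lower(): value for key, value in preferred.items()}
    return lookup.get(tag.lower(), tag)


# Hypothetical mapping: the right-hand value is an illustrative assumption.
PREFERRED = {"en-GB-oed": "en-GB-oxendict"}
```

With such a table, normalize_on_input("en-GB-oed", PREFERRED) returns the regular replacement, and any tag without an entry passes through unchanged.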

So you can do it, but you're threatening to "dump the data" if you  
don't get what you propose?

I find it hard to see anything positive in what you have
proposed. Perhaps there is something positive there, but I am not
seeing it.

Michael Everson * http://www.evertype.com/