xx-XX-nnnn vs. xx-nnnn in Chinese and German

John Cowan jcowan@reutershealth.com
Wed, 13 Feb 2002 16:22:17 -0500

Torsten Bronger wrote:

> In this context: RFC 3066 says that these tags should be interpreted
> as "one token".  I take this to mean that software should
> understand the whole tag or nothing.  Is this a good approach?  If
> "fallbacks" were allowed, I'd see no problem with "overtagging" texts.

But we do allow partial matches:  Section 2.5 introduces the
concept of the language-range.  Thus "de", considered as a range,
matches the language tags "de", "de-de", "de-at", "de-at-1996",
and so on.  So no, depending on the application, one does not need to
understand the whole tag.
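
To make the prefix rule concrete, here is a rough sketch in Python.
It assumes the usual reading of Section 2.5: a range matches a tag
if the two are equal, or if the range is a prefix of the tag and the
next character in the tag is "-"; comparison is case-insensitive and
"*" matches anything.  The function name range_matches is mine, not
anything defined by the RFC.

```python
def range_matches(language_range: str, tag: str) -> bool:
    """Prefix matching in the spirit of RFC 3066, Section 2.5 (a sketch)."""
    rng = language_range.lower()   # tags are compared case-insensitively
    t = tag.lower()
    if rng == "*":                 # the wildcard range matches every tag
        return True
    # Either the whole tag equals the range, or the range is followed by "-".
    return t == rng or t.startswith(rng + "-")


tags = ["de", "de-de", "de-at", "de-at-1996", "den", "en-us"]
matched = [t for t in tags if range_matches("de", t)]
print(matched)  # ['de', 'de-de', 'de-at', 'de-at-1996']
```

Note that "den" is not matched: the prefix has to end at a subtag
boundary, not merely at a character boundary.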

Common sense has to be applied here: an audio application told to
filter out all resources that are not in "zh" may still let through
recorded speech in a language unintelligible to the client, because
understanding zh-yue does not suffice to understand zh-guoyu, nor
vice versa.

>             Language   Subform   Orthography
> de           German       ?           ?
> de-DE        German    Germany        ?
> de-AT        German    Austria        ?
> de-DE-1996   German    Germany      "new"
> de-AT-1996   German    Austria      "new"
> de-DE-1901   German    Germany      "old"
> de-AT-1901   German    Austria      "old"
> de-1996      German       ?         "new"
> de-1901      German       ?         "old"

I agree with this schema.
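
The table can also be read off mechanically.  A minimal sketch in
Python, assuming only the tag shapes listed above (a language, an
optional two-letter country, an optional 1901/1996 subtag); the
names ORTHOGRAPHY and describe are hypothetical, and None stands
for the "?" entries:

```python
# Hypothetical lookup for the two orthography subtags used in the table.
ORTHOGRAPHY = {"1996": "new", "1901": "old"}


def describe(tag: str):
    """Split a tag into (language, subform, orthography); None means '?'."""
    parts = tag.split("-")
    language = parts[0].lower()
    subform = None
    orthography = None
    for sub in parts[1:]:
        if len(sub) == 2 and sub.isalpha():
            subform = sub.upper()            # two-letter country code
        elif sub in ORTHOGRAPHY:
            orthography = ORTHOGRAPHY[sub]   # spelling-reform subtag
    return language, subform, orthography


for tag in ["de", "de-AT", "de-AT-1996", "de-1901"]:
    print(tag, describe(tag))
```

This handles nothing beyond the rows of the table; it is not a
general parser for language tags.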

