zh-****-** tags

Doug Ewell dewell at adelphia.net
Sat Mar 19 18:05:03 CET 2005

Frank Ellermann <nobody at xyzzy dot claranet dot de> wrote:

>> The fact that these regions happen to belong to CN does not
>> mean that HK and MO are not perfectly acceptable region IDs.
> Sure, use pt-MO or en-HK where needed.  But don't abuse these
> codes in artificial tags for languages which are not directly
> related to MO or HK.

I don't think anyone has proven that Chinese in Macao or Hong Kong does
*not* differ from Chinese in China.  I'm not saying that's a
justification for encoding them, but by the same token, it seems
presumptuous to dismiss the distinction as "artificial."

> The *-MO proposals sound like de-LI-1996
> vs. de-LI-1901.

If de-LI differs from de-{AT, CH, DE}, and if the 1996 Rechtschreibung
was adopted in Liechtenstein, and if a sufficient body of work in de-LI
exists in both orthographies, then this might be a useful distinction.
(Under RFC 3066bis, of course, you could compose such things dynamically
without the need to register or debate them.)
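
To illustrate what "composing dynamically" could mean, here is a minimal
sketch, assuming the draft's language-script-region-variant subtag order.
The function name compose_tag is hypothetical, and the sketch only joins
and case-normalizes subtags; it does not check them against any registry.

```python
def compose_tag(language, script=None, region=None, variants=()):
    """Join subtags in the order language-script-region-variant.

    Casing follows the usual convention: lowercase language,
    title-case script, uppercase region, lowercase variants.
    No registry lookup or validation is performed.
    """
    parts = [language.lower()]
    if script:
        parts.append(script.title())
    if region:
        parts.append(region.upper())
    parts.extend(v.lower() for v in variants)
    return "-".join(parts)


print(compose_tag("de", region="LI", variants=["1996"]))
print(compose_tag("de", region="LI", variants=["1901"]))
```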

> Or like en-US-NewOrleans vs. en-US-NewYork.

There are certainly dialect-level differences between these two, which
might also be useful to someone.  (The draft would allow users to do
this with -x- private-use subtags, although -NewOrleans would have to be
chopped to fit the Procrustean 8-character limit.)
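
As a rough sketch of that chopping, assuming private-use subtags are
limited to 8 alphanumeric characters after the "x" singleton: the helper
name private_use_tag and the truncate-to-first-8 policy are assumptions
made for illustration. A real user would more likely pick a meaningful
abbreviation by hand.

```python
import re

MAX_SUBTAG_LEN = 8  # the 8-character subtag limit mentioned above


def private_use_tag(prefix, *labels):
    """Append -x- private-use subtags to prefix, chopping each to 8 chars.

    Non-alphanumeric characters are dropped and the result is lowercased.
    Truncation is a naive illustration, not a recommended naming policy.
    """
    subtags = []
    for label in labels:
        cleaned = re.sub(r"[^A-Za-z0-9]", "", label)
        if not cleaned:
            raise ValueError(f"no usable characters in {label!r}")
        subtags.append(cleaned[:MAX_SUBTAG_LEN].lower())
    return "-".join([prefix, "x", *subtags])


print(private_use_tag("en-US", "NewOrleans"))  # en-US-x-neworlea
print(private_use_tag("en-US", "NewYork"))     # en-US-x-newyork
```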

>> we have tags such as zh-guoyu and zh-yue.
> Yes, and if I understood the ethnologue info correctly the
> latter is not good enough to cover a (hypothetical) zh-SG-yue.
> If that's the case it doesn't automatically justify zh-MO-yue
> (under the 3066 rules).

No tag registration is ever automatic, as we have seen.  Each is
considered on its own merits.

Please, let's get the order right.  If encoded, these would be zh-yue-SG
and zh-yue-MO.

> de-LI, de-BE, and some others are also not unreasonable, quite
> the contrary.  Ditto zh-US and others.  But I don't see why
> that's a reason to register all theoretical permutations under
> 3066 rules.

That is the core of the whole debate.  iu-Cans-CA has not been approved
because it seems unlikely that anyone will be able to demonstrate use of
iu-Cans outside of Canada.  Others stand a better chance, assuming the
evidence can be gathered.

>> Apparently you have either just joined the list
> Yes, because the new 3066bis idea of region codes is dubious,

I assume you are alluding to the question of allowing region subtags
based on pre-1995 or pre-1988 country codes.  We should carry on that
debate on LTRU, not here.

> and adding these zh-hanZ-XY tags as grandfathered to a future
> 3066bis registry makes it worse.  Redundant would be only ugly
> and no real problem, but more grandfathered tags are not what
> a future 3066bis and its future implementations need.

These proposed tags would become redundant, not grandfathered, since
they would be composable according to the rules and grammar of 3066bis.

>> script distinctions are almost always going to matter more
>> than regional dialect or spelling variations
> Not if it's a language with a "default" script like Latn.  It's
> not yet clear how 3066bis will solve this problem if at all.

We've discussed the matter of "default" scripts on this list before.
It's actually a difficult concept to define for some languages.

If anything, though, I would think spelling distinctions matter *more*
for a language with a clear default script.  Probably there are a great
many people who can read both az-Latn and az-Cyrl.  The odds of finding
someone who can read de-Arab or de-Deva, however, seem quite remote.

> The last state was that en-Latn-US-boont won't match en-boont.
> Your idea to sort subtags by importance is fine, that could
> result in en-boont-Latn-US, de-1996-LI, zh-yue-SG, etc.

The proposed order has been debated extensively on this list, and is:

extended language

Please, if we are going to propose different orderings (as in
*en-boont-Latn-US or *de-1996-LI), let's discuss them on LTRU.
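
For reference, the matching concern quoted above follows from the
prefix-matching rule of RFC 3066, where a language range matches a tag
only if it equals the tag or is a prefix of it ending at a hyphen.  A
minimal sketch, with a hypothetical function name and without the "*"
wildcard range:

```python
def range_matches(language_range, tag):
    """RFC 3066-style prefix match, case-insensitive.

    The range matches if it equals the tag, or if the tag starts with
    the range followed immediately by "-".
    """
    r = language_range.lower()
    t = tag.lower()
    return t == r or t.startswith(r + "-")


print(range_matches("en-boont", "en-boont"))          # True
print(range_matches("en-boont", "en-Latn-US-boont"))  # False
print(range_matches("en", "en-Latn-US-boont"))        # True
```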

zh-yue-SG would not be valid under the draft unless extended-language
subtags became operational (probably using ISO 639-3 codes).
Grandfathered zh-yue cannot be combined with anything.

-Doug Ewell
 Fullerton, California
