wadegile and pinyin LANGUAGE SUBTAG REGISTRATION FORMs

Wed Sep 3 16:24:24 CEST 2008

From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Michael Everson
Sent: Tuesday, August 26, 2008 6:31 AM

>> Is that to say you approve them with a Prefix value of
>> "zh-Latn", as shown on Mark's "R2" registration forms?
>
> Erm, no. Both Wade Giles and Hanyu Pinyin imply Latin
> inherently, in my opinion.

I think we all agree that Latin is implied. Chinese is also implied. By this rationale, a complete tag of "wadegile" would work just as well as "zh-wadegile" (BCP47 syntax requirements aside). In terms of semantic representation, that is true: "wadegile" contains just as much information as does "zh-wadegile".

But in processing operations, they are not equal: having a separate subtag denoting the 'Chinese' semantic (or, in a 4646bis era, the 'Mandarin' semantic) makes it easy for processes to recognize that semantic without needing to have tables recording the relationship between "wadegile" and "zh". In just the same way, including "Latn" frees processes from needing to have tables recording the relationship between "wadegile" and "Latn".

We need to consider how tags will get used -- in matching -- together with the matching algorithms described in BCP47 (RFC 4647). Realistic scenarios include

- matching a request for "zh-Latn" content with content tagged to indicate Wade Giles or Hanyu Pinyin Romanizations

- matching a request for Wade Giles or Pinyin content with the best-available match, which may be content tagged "zh-Latn"

Both scenarios are made more complicated if "Latn" is not part of the prefix for "wadegile" and "pinyin".
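
To make the two scenarios concrete, here is a rough Python sketch of the basic filtering and lookup schemes from RFC 4647 (sections 3.3.1 and 3.4). It is only an illustration under stated assumptions: the function names basic_filter and lookup are made up for this example, tags are assumed to be well-formed, and "zh-wadegile" stands in for what a tag would look like if the variant were registered without "Latn" in its prefix.

```python
from typing import Iterable, Optional


def basic_filter(language_range: str, tag: str) -> bool:
    """Basic filtering (RFC 4647, 3.3.1): the range matches a tag if it is
    "*", equals the tag, or is a prefix of the tag followed by "-".
    Comparison is case-insensitive."""
    r, t = language_range.lower(), tag.lower()
    return r == "*" or t == r or t.startswith(r + "-")


def lookup(language_range: str, available: Iterable[str]) -> Optional[str]:
    """Lookup (RFC 4647, 3.4): truncate the range from the end, one subtag
    at a time, until it exactly equals one of the available tags."""
    by_lower = {tag.lower(): tag for tag in available}
    subtags = language_range.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in by_lower:
            return by_lower[candidate]
        subtags.pop()
        # A single-character subtag left at the end is removed as well.
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return None  # caller falls back to its own default


# Scenario 1: a request for "zh-Latn" content.
with_latn = basic_filter("zh-Latn", "zh-Latn-wadegile")
without_latn = basic_filter("zh-Latn", "zh-wadegile")  # hypothetical tag

# Scenario 2: a request for Wade Giles content, best available match.
content = ["zh-Latn", "zh"]
best_with_latn = lookup("zh-Latn-wadegile", content)
best_without_latn = lookup("zh-wadegile", content)  # hypothetical tag
```

With "Latn" in the tag, the "zh-Latn" range matches the Wade Giles content and the Wade Giles request falls back to "zh-Latn". Without it, the filter does not match and the lookup falls straight through to "zh", unless the process keeps a separate table relating the variant to its script.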

Peter