LANGUAGE SUBTAG REGISTRATION FORM (R4) - Pinyin

Mark Davis mark at macchiato.com
Fri Sep 26 16:16:17 CEST 2008


Mark


On Fri, Sep 26, 2008 at 3:45 PM, Michael Everson <everson at evertype.com> wrote:

> On 26 Sep 2008, at 12:40, Mark Davis wrote:
>
> > Type: variant
> > Subtag: hpin1958
> > Description: Hanyu Pinyin romanization of Mandarin Chinese
> > Prefix: zh-Latn
>
> I do not prefer this because, as John said yesterday:
>
> >> For every relevant language there is a Hanyu-Pinyin-type
> >> orthography; rather than devising individual subtags for all of
> >> these, we devise one, which when applied to Chinese signifies Hanyu
> >> Pinyin.
>
>
> I think the right way to do this is to use "pinyin" in the same way as
> "fonupa" is used, applicable to many languages. When applied to zh
> alone, it should be defined to mean Mandarin (in the absence of being
> able to use cmn). I do not think we should have 40 different pinyin-
> based subtags. It is clear that the Chinese are using the conventions
> of this alphabet for many languages.


As far as I can tell, only you and John think this is a good idea. Having
looked at some of the other romanizations, I see too little real commonality
among the systems for "pinyin" to be usefully, and not misleadingly, applied
to them.

I think by far the better course is to register "pinyin" as the Hanyu Pinyin
romanization of Mandarin Chinese, with prefix zh-Latn. If at some later time
it looks reasonable to broaden it, BCP 47 explicitly permits that, as you
know.


>
> I have however now almost been convinced that it is appropriate to put
> "zh-Latn" in the prefix for this and for "wadegile", though I don't
> know what you guys expect software to do when it finds the -Latn-
> omitted, as it surely will. Do you just not care?


In software, you always have to have fallbacks. Someone might write az, and
you don't know whether Latn, Arab, or Cyrl is meant. In matching, you can use
a broad interpretation (any script). In lookup (e.g. for a web page), you can
pick the most likely one. That's why in Unicode we have "likely subtags"
data:

http://www.unicode.org/cldr/data/charts/supplemental/likely_subtags.html

But if a product doesn't use this kind of information, then it helps to have
the tag be zh-Latn-pinyin.
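
To make the fallback concrete, here is a rough sketch of how software might
fill in an omitted script subtag. Everything in it is an assumption made for
illustration: the two small tables are hand-written samples (real code would
load the CLDR likely-subtags data and the registry's Prefix fields), the
function name add_likely_script is made up, and extlang and private-use
subtags are ignored.

```python
# Illustrative sample only; a real product would load CLDR's likely-subtags data.
LIKELY_SCRIPT = {"az": "Latn", "zh": "Hans"}

# Hypothetical table: the script implied by a variant's registered prefix,
# mirroring the "zh-Latn" prefix discussed in this thread.
VARIANT_PREFIX_SCRIPT = {"pinyin": "Latn", "wadegile": "Latn"}


def add_likely_script(tag):
    """Return the tag with a script subtag inserted if one was omitted."""
    parts = tag.split("-")
    language, rest = parts[0].lower(), parts[1:]

    # A script subtag is four letters and directly follows the language.
    if rest and len(rest[0]) == 4 and rest[0].isalpha():
        return "-".join([language, rest[0].title()] + rest[1:])

    # Prefer the script implied by a known variant, if any.
    script = None
    for subtag in rest:
        if subtag.lower() in VARIANT_PREFIX_SCRIPT:
            script = VARIANT_PREFIX_SCRIPT[subtag.lower()]
            break

    # Otherwise fall back to the most likely script for the language.
    if script is None:
        script = LIKELY_SCRIPT.get(language)

    # Nothing known about this language: leave the tag unchanged.
    if script is None:
        return tag
    return "-".join([language, script] + rest)
```

With these sample tables, zh-pinyin would be expanded to zh-Latn-pinyin, while
a bare az would get Latn as its most likely script.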


>
>
> Michael Everson * http://www.evertype.com