Pinyin

Michael Everson everson at evertype.com
Wed Sep 17 16:40:20 CEST 2008


Right. So we need to be able to tag data so that people know that the  
language is Mandarin and the specific Romanization is Hanyu Pinyin, a  
Latin-based orthography. However, the orthographic conventions behind
Pinyin are also applied to other Chinese languages, and a variant of
them is used in Taiwan.

zh-pinyin
zh-CN-pinyin
zh-Latn-pinyin
zh-Latn-CN-pinyin
These are formally ambiguous as to which Chinese language is meant
(that is, which of the set of languages written in Han characters).
However, in this registration I think we ought to SPECIFY that the
string zh-pinyin refers to Mandarin Chinese in Hanyu Pinyin
orthography, not to any other form of Chinese nor to any other
orthography.

zh-cmn-pinyin
zh-cmn-Latn-pinyin
zh-cmn-CN-pinyin
zh-cmn-Latn-CN-pinyin
cmn-pinyin
cmn-Latn-pinyin
cmn-CN-pinyin
cmn-Latn-CN-pinyin
All of these can only mean Mandarin Chinese in Hanyu Pinyin  
romanization; they are not yet permitted but will be (one supposes).  
Other Chinese languages might be listed with 639-3 in due course.

zh-TW-pinyin
zh-Latn-TW-pinyin
Both of these mean Tongyong Pinyin orthography, again defaulting to
the Mandarin Chinese language.

zh-cmn-TW-pinyin
zh-cmn-Latn-TW-pinyin
cmn-TW-pinyin
cmn-Latn-TW-pinyin
All of these can only mean Mandarin Chinese in Tongyong Pinyin  
romanization; they are not yet permitted but will be (one supposes).

bo-pinyin
bo-Latn-pinyin
Both of these mean Tibetan language in Tibetan Pinyin romanization (as
opposed to Wylie, for instance).
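
To make the readings above concrete, here is a rough Python sketch of
how a consumer might map one of these tags to a language and a
romanization. The function name interpret() is hypothetical, and
classifying subtags purely by length is a simplifying assumption, not
the full tag grammar. It encodes only the interpretation proposed in
this message, nothing that is in the registry.

```python
def interpret(tag):
    """Map a '-pinyin' tag to (language, romanization) under the reading proposed above."""
    subtags = tag.lower().split("-")  # language tags are case-insensitive
    if len(subtags) < 2 or subtags[-1] != "pinyin":
        raise ValueError(f"not a pinyin tag: {tag!r}")
    primary, middle = subtags[0], subtags[1:-1]

    # Simplifying assumption: classify the middle subtags by length alone
    # (3 letters = extended language, 4 = script, 2 = region).
    extlang = next((s for s in middle if len(s) == 3), None)
    script = next((s for s in middle if len(s) == 4), None)
    region = next((s for s in middle if len(s) == 2), None)

    if script not in (None, "latn"):
        raise ValueError(f"pinyin is a Latin-script orthography: {tag!r}")

    if primary == "bo":
        return ("Tibetan", "Tibetan Pinyin")

    # zh with no extended language defaults to Mandarin; cmn is Mandarin outright.
    if primary == "cmn" or (primary == "zh" and extlang in (None, "cmn")):
        if region == "tw":
            return ("Mandarin Chinese", "Tongyong Pinyin")
        return ("Mandarin Chinese", "Hanyu Pinyin")

    raise ValueError(f"no reading proposed for: {tag!r}")
```

For example, interpret("zh-pinyin") and interpret("cmn-Latn-CN-pinyin")
both give Mandarin Chinese in Hanyu Pinyin, while
interpret("zh-Latn-TW-pinyin") gives Mandarin Chinese in Tongyong
Pinyin. A tag naming some other Chinese language is rejected rather
than guessed at.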

Peter says he would like the recommended prefix to contain -Latn-.  
Mark said he could live with or without it but thought that "with"  
should be recommended. Should we assist users of this subtag by having  
some redundancy in the registration? At this stage I think that "best  
practice" (with -Latn-) being the only one specified might be  
insufficient.
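
As an illustration of what is at stake in the prefix question, here is
a small Python sketch comparing two candidate prefix lists. The helper
matches_prefix() and both lists are hypothetical, written only for
this comparison; the matching is a plain subtag-by-subtag comparison,
which is an assumption about how a checking tool might behave.

```python
def matches_prefix(tag, prefixes):
    """Return True if the tag begins, subtag by subtag, with any of the given prefixes."""
    subtags = tag.lower().split("-")
    for prefix in prefixes:
        wanted = prefix.lower().split("-")
        if subtags[:len(wanted)] == wanted:
            return True
    return False

# Hypothetical candidate lists for the Prefix fields of the registration.
LATN_ONLY = ["zh-Latn", "bo-Latn"]              # "best practice" only
REDUNDANT = ["zh-Latn", "bo-Latn", "zh", "bo"]  # with some redundancy

for tag in ["zh-pinyin", "zh-Latn-CN-pinyin", "bo-pinyin"]:
    print(tag, matches_prefix(tag, LATN_ONLY), matches_prefix(tag, REDUNDANT))
```

Under the first list, a tool that checks tags in this way would not
find a matching prefix for zh-pinyin or bo-pinyin; under the second it
would find one for all three tags.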

