Lang Gérard gerard.lang at
Mon Sep 29 15:12:35 CEST 2008

I completely agree with Mark Davis's proposal to (sub-)tag romanization systems by reference to the "responsible agencies".
In fact, many examples concern the romanization of geographic names by specialized agencies. Cartography began long ago in Latin, Italian, French, English, German, Dutch, Spanish, Portuguese..., for the land part of the earth as well as for its maritime part, and it evidently required, and still requires, the romanization of geographical names. Such agencies include:
-The US BGN (Board on Geographic Names), and also the National ;
-The GB PCGN (Permanent Committee on Geographical Names);
-The FR IGN (Institut Géographique national);
-The UNGEGN (United Nations Group of Experts on Geographical Names), whose work is used and referenced in ISO 3166, and which recently published:
*Manual for the standardization of geographical names (Manual M 88; February 2006), and
*Technical reference manual for the standardization of geographical names (Manual M 87; March 2007).


	From: ietf-languages-bounces at [mailto:ietf-languages-bounces at] On behalf of Mark Davis
	Sent: Monday, 29 September 2008 10:47
	To: ietf-languages at
	Some people apparently thought at first that hpin1958 was a joke -- it wasn't. I was trying to be consistent with the tack taken on acad1958. People from the 'broad pinyin' camp are claiming that there is a set of romanizations that follow the same principles, and that these should therefore carry the broad term 'pinyin'. No evidence of those principles, and no pointers to their documentation, have yet followed, so there is as yet no reason to think that this would be a good approach.

	*However*, an alternative approach to romanization systems could end up with the same result, but be defined on a sounder basis. That would be to use such subtags not as representing some (ill-defined or undefined) principles, but as denoting 'responsible agencies'. For example, we could have any of:

	*	ru-Latn-bgn - A romanization of Russian defined by the U.S. Board on Geographic Names 
	*	ru-Latn-ungegn - A romanization of Russian defined by the UNITED NATIONS GROUP OF EXPERTS ON GEOGRAPHICAL NAMES (UNGEGN) 
	*	ru-Latn-gost - A romanization of Russian defined by the GOST standards, now administered by the Euro-Asian Council for Standardization, Metrology and Certification (EASC)

	This could be a principled approach to romanization subtags (and other transliteration subtags), since it would be clear what the combination of language+script+subtag would denote in such cases. That is, the 'ru' could be replaced by arbitrary other language subtags, and the meaning of the resulting tag would still be well defined. (It might have an empty denotation, like bo-Cher-AQ, but the semantics would be well defined.) Each of ungegn, gost, bgn, etc. would point to a particular agency in its registration.
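The compositional reading described above can be sketched in a few lines of Python. This is only an illustration, not part of any registration: the `AGENCIES` table and the `describe` function are hypothetical names, and the agency descriptions are copied from the three examples in the list above.

```python
# Hypothetical lookup table: agency subtag -> responsible agency.
# The three entries mirror the examples given in the message; nothing
# here is taken from an actual subtag registry.
AGENCIES = {
    "bgn": "U.S. Board on Geographic Names",
    "ungegn": "United Nations Group of Experts on Geographical Names",
    "gost": "GOST standards (administered by EASC)",
}


def describe(tag):
    """Describe a language-script-agency tag.

    The meaning is assembled from the three parts independently, so the
    language subtag can be swapped freely and the result stays well defined.
    """
    language, script, agency = tag.split("-")
    key = agency.lower()
    if key not in AGENCIES:
        raise ValueError(f"unknown agency subtag: {agency}")
    return f"'{language}' written in {script} as defined by {AGENCIES[key]}"


print(describe("ru-Latn-bgn"))
print(describe("ka-Latn-ungegn"))
```

Whether a given agency actually publishes a system for a given language is a separate question; the sketch only shows that the tag can be interpreted without knowing the answer.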

	And I think such a strategy for subtag registration would be a reasonable one. One could even combine that with year subtags to indicate revisions, where necessary. For example, 

	he-Latn-ungegn - any of the Hebrew romanization systems defined by the UNITED NATIONS GROUP OF EXPERTS ON GEOGRAPHICAL NAMES (UNGEGN).

	he-Latn-ungegn-2003 : the 2003 version of the UNGEGN romanization
	he-Latn-ungegn-2008 : the 2008 version of the UNGEGN romanization
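As a rough sketch of how the year subtag narrows the agency subtag, the following Python assumes tags of the shape language-Latn-agency with an optional four-digit year. The pattern and the helper names `parse` and `covers` are illustrative assumptions, not a proposal for how matching should be specified.

```python
import re

# Assumed tag shape for this sketch only: language-Latn-agency[-year].
TAG = re.compile(
    r"^(?P<language>[a-z]{2,3})-Latn-(?P<agency>[a-z]+)(?:-(?P<year>\d{4}))?$"
)


def parse(tag):
    """Split a tag into (language, agency, year); year is None when absent."""
    match = TAG.match(tag)
    if match is None:
        raise ValueError(f"not a language-Latn-agency[-year] tag: {tag}")
    year = match.group("year")
    return match.group("language"), match.group("agency"), int(year) if year else None


def covers(general, specific):
    """True if `general` denotes a set of systems that includes `specific`.

    A tag without a year stands for any revision by that agency; a tag
    with a year stands for that revision only.
    """
    g_language, g_agency, g_year = parse(general)
    s_language, s_agency, s_year = parse(specific)
    same_system = (g_language, g_agency) == (s_language, s_agency)
    return same_system and g_year in (None, s_year)


print(covers("he-Latn-ungegn", "he-Latn-ungegn-2008"))       # True
print(covers("he-Latn-ungegn-2003", "he-Latn-ungegn-2008"))  # False
```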

	So, consistent with that, we could define the subtag 'pinyin' as being one of a set of romanizations defined by the Chinese government, and have not only 


	but also, according to whatever standards the Chinese government publishes:


	Taking that path, it would probably be best not to specify language prefixes in the registration form, but rather to note in the Description that 'pinyin' should be combined with a language subtag and 'Latn' to indicate a romanization for that language according to Chinese government standards, since they could be extended over time (we would not want a precedent that would end up having 50+ different Prefixes for bgn, or ungegn, or ...).

	That would then be a methodology that would make sense to me for going forward with pinyin.

	On Fri, Sep 26, 2008 at 1:40 PM, Mark Davis <mark at> wrote:

		I've produced a modified version of my R3, with the following changes: 

		- change of the subtag name for consistency with Michael's approach on Belarusian
		- addition of zh-Latn prefix (as discussed on the list)
		- some additional information on the letters used in this system.


		1. Name of requester: 
		Mark Davis
		2. E-mail address of requester: 
		markdavis at
		3. Record Requested:
		Type: variant
		Subtag: hpin1958
		Description: Hanyu Pinyin romanization of Mandarin Chinese
		Prefix: zh-Latn
		4. Intended meaning of the subtag:
		To distinguish Mandarin Chinese content written in Latin characters using the Hanyu Pinyin romanization (as adopted by China in 1958) from the other possible transcriptions. It uses the Latin letters [aáàǎā b-d eéèěē f-h iíìǐī j-n oóòǒō p-t uúùǔū üǘǜǚǖ w-z]: that is, a-z, minus v, plus ü and 4 additional accented versions of each of the vowels.
		5. Reference to published description of the language (book or article):
		Hanyu Pinyin, the most commonly used system for Mandarin Chinese romanization, has been the national standard of China since 1958, and an international standard (ISO 7098:1991, 2nd ed.) since 1982. 

		See also the LOC page for the relation between Hanyu Pinyin and Wade-Giles:
		6. Any other relevant information:
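The letter inventory given under item 4 of the form can be written down as a small check. The sketch below is an illustration only: the names `HPIN1958_LETTERS` and `uses_only_hpin1958_letters` are made up here, and the choices to lower-case the input and to ignore non-letters (spaces, digits, punctuation) are assumptions that the form does not state.

```python
import unicodedata

# a-z minus v, as described in item 4 of the form.
BASE = set("abcdefghijklmnopqrstuwxyz")

# u-diaeresis plus four tone-marked versions of each of the six vowels.
MARKED = "ü á à ǎ ā é è ě ē í ì ǐ ī ó ò ǒ ō ú ù ǔ ū ǘ ǜ ǚ ǖ".split()

# Normalize each letter to NFC so the set holds precomposed characters.
HPIN1958_LETTERS = BASE | {unicodedata.normalize("NFC", ch) for ch in MARKED}


def uses_only_hpin1958_letters(text):
    """True if every letter in `text` belongs to the inventory above.

    Assumptions of this sketch: case is ignored and non-letters are skipped.
    """
    normalized = unicodedata.normalize("NFC", text.lower())
    return all(ch in HPIN1958_LETTERS for ch in normalized if ch.isalpha())


print(len(HPIN1958_LETTERS))                       # 50
print(uses_only_hpin1958_letters("Hànyǔ Pīnyīn"))  # True
print(uses_only_hpin1958_letters("vase"))          # False
```

A check like this says nothing about whether the text is valid Hanyu Pinyin; it only tests the letter repertoire.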
