LANGUAGE SUBTAG REGISTRATION FORM (R4) - Pinyin

Mark Davis mark at macchiato.com
Mon Sep 29 10:46:32 CEST 2008


Some people apparently thought at first that hpin1958 was a joke -- it
wasn't. I was trying to be consistent with the tack taken on
acad1958. People from the 'broad pinyin' camp are claiming that there is a
set of romanizations that follow the same principles, and that these
should therefore share the broad term 'pinyin'. No evidence or pointers to
documentation of those principles have yet followed, so there is as yet no
reason to think that that would be a good approach.
*However*, an alternative approach to romanization systems could end up with
the same result, but be defined on a sounder basis. That would be to use
such subtags not as representing some (ill-defined or undefined) principles,
but as denoting 'responsible agencies'. For example, we could have any of:


   - ru-Latn-bgn - A romanization of Russian defined by the U.S. Board on
     Geographic Names
   - ru-Latn-ungegn - A romanization of Russian defined by the United
     Nations Group of Experts on Geographical Names (UNGEGN)
   - ru-Latn-gost - A romanization of Russian defined by the GOST
     standards, now administered by the Euro-Asian Council for
     Standardization, Metrology and Certification (EASC)

This could be a principled approach to romanization subtags (and other
transliteration subtags), since it would be clear what the combination of
language+script+subtag would denote in such cases. That is, the 'ru'
could be replaced by arbitrary other language subtags, and the meaning of
the resulting tag would still be well defined. (It might have an empty
denotation, like bo-Cher-AQ, but the semantics would be well defined.) Each
of ungegn, gost, bgn, etc would point to a particular agency in its
registration.
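
The substitution described above can be sketched in a few lines of Python.
This is only an illustration, not part of any proposal: the names
romanization_tag, describe and AGENCIES are hypothetical, and the sketch
does not check subtags against the BCP 47 well-formedness rules.

```python
# Hypothetical table: each agency subtag points to one responsible agency,
# as its registration would. The subtags are the ones suggested in the text.
AGENCIES = {
    "bgn": "U.S. Board on Geographic Names",
    "ungegn": "United Nations Group of Experts on Geographical Names",
    "gost": "GOST standards (administered by EASC)",
}


def romanization_tag(language: str, agency: str) -> str:
    """Build language + 'Latn' + agency, e.g. 'ru-Latn-bgn'."""
    if agency not in AGENCIES:
        raise ValueError(f"unknown agency subtag: {agency!r}")
    return f"{language}-Latn-{agency}"


def describe(tag: str) -> str:
    """Spell out what a tag built by romanization_tag() denotes."""
    language, script, agency = tag.split("-")
    return f"romanization ({script}) of '{language}' defined by {AGENCIES[agency]}"


# The language subtag can vary freely; the meaning stays well defined,
# even if no such romanization actually exists for that language.
for lang in ("ru", "he"):
    print(describe(romanization_tag(lang, "ungegn")))
```

The point of keeping the agency table separate from the language is that
the meaning of the agency subtag never depends on which language it is
attached to.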

And I think such a strategy for subtag registration would be a reasonable
one. One could even combine that with year subtags to indicate revisions,
where necessary. For example,

he-Latn-ungegn : any of the Hebrew romanization systems defined by the
United Nations Group of Experts on Geographical Names (UNGEGN)

he-Latn-ungegn-2003 : the 2003 version of the UNGEGN romanization
he-Latn-ungegn-2008 : the 2008 version of the UNGEGN romanization
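
The relation between the undated tag and its dated revisions can be
sketched as follows. This is a rough Python illustration under stated
assumptions: the names RomanizationTag, parse and covers are hypothetical,
a revision is assumed to be a four-digit year in the fourth position, and
subtags are compared exactly as written (no case folding).

```python
from typing import NamedTuple, Optional


class RomanizationTag(NamedTuple):
    language: str
    script: str
    agency: str
    revision: Optional[str]  # four-digit year, or None for "any revision"


def parse(tag: str) -> RomanizationTag:
    """Split language-script-agency[-year] into its parts."""
    parts = tag.split("-")
    if len(parts) not in (3, 4):
        raise ValueError(f"expected 3 or 4 subtags, got {len(parts)}: {tag!r}")
    revision = parts[3] if len(parts) == 4 else None
    if revision is not None and not (len(revision) == 4 and revision.isdigit()):
        raise ValueError(f"revision must be a four-digit year: {revision!r}")
    return RomanizationTag(parts[0], parts[1], parts[2], revision)


def covers(general: str, specific: str) -> bool:
    """True if `general` denotes a set of systems that includes `specific`."""
    g, s = parse(general), parse(specific)
    same_system = g[:3] == s[:3]
    return same_system and (g.revision is None or g.revision == s.revision)


print(covers("he-Latn-ungegn", "he-Latn-ungegn-2003"))
print(covers("he-Latn-ungegn-2003", "he-Latn-ungegn-2008"))
```

Under these assumptions the undated tag covers every revision, while two
different years do not cover each other.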

So, consistent with that, we could define the subtag 'pinyin' as being one
of a set of romanizations defined by the Chinese government, and have not
only

zh-Latn-pinyin

but also, according to whatever standards the Chinese government publishes:

ug-Latn-pinyin
bo-Latn-pinyin
mn-Latn-pinyin
...

Taking that path, it would probably be best not to specify language
prefixes in the registration form, but rather to note in the Description that
'pinyin' should be combined with a language subtag and 'Latn' to indicate a
romanization for that language according to Chinese government standards,
since the set of languages could be extended over time (we would not want a
precedent that would end up with 50+ different Prefixes for bgn, or ungegn,
or ...).

That would then be a methodology that would make sense to me for going
forward with pinyin.

Mark


On Fri, Sep 26, 2008 at 1:40 PM, Mark Davis <mark at macchiato.com> wrote:

> I've produced a modified version of my R3, with the following changes:
> - change of the subtag name for consistency with Michael's approach on
> Belarusian
> - addition of zh-Latn prefix (as discussed on the list)
> - some additional information on the letters used in this system.
>
> ====
>
> LANGUAGE SUBTAG REGISTRATION FORM (R4)
> 1. Name of requester:
>
> Mark Davis
>
> 2. E-mail address of requester:
>
> markdavis at google.com
>
> 3. Record Requested:
>
> Type: variant
> Subtag: hpin1958
> Description: Hanyu Pinyin romanization of Mandarin Chinese
> Prefix: zh-Latn
>
> 4. Intended meaning of the subtag:
>
> To distinguish Mandarin Chinese content written in Latin characters using the Hanyu
> Pinyin romanization (as adopted by China in
> 1958) from the other possible transcriptions. It uses the Latin
> letters [aáàǎā b-d eéèěē f-h iíìǐī j-n oóòǒō p-t uúùǔū üǘǜǚǖ w-z]: that is,
> a-z, minus v, plus ü and 4 additional accented versions of each of the
> vowels.
>
> 5. Reference to published description of the language (book or article):
>
> Hanyu Pinyin, the most commonly used system for Mandarin
> Chinese romanization, has been the national standard of China since 1958, and an international standard (ISO 7098:1991, 2nd ed.) since 1982.
> See also the LOC page for the relation between Hanyu Pinyin and
> Wade-Giles: http://www.loc.gov/catdir/pinyin/romcover.html
>
> 6. Any other relevant information:
>
>
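
The letter inventory given in item 4 of the quoted form (a-z, minus v,
plus ü, plus four accented versions of each vowel) can be cross-checked
with a short Python sketch. The variable names are hypothetical, and the
sketch assumes the accented letters are stored in precomposed (NFC) form.

```python
import string
import unicodedata

# Combining marks for the four accented forms: acute, grave, caron, macron.
TONE_MARKS = ("\u0301", "\u0300", "\u030C", "\u0304")
VOWELS = "aeiou\u00FC"  # the five plain vowels plus u-with-diaeresis

# a-z, minus v, plus u-with-diaeresis.
base_letters = (set(string.ascii_lowercase) - {"v"}) | {"\u00FC"}

# Compose each vowel with each tone mark into a single precomposed letter.
accented = {
    unicodedata.normalize("NFC", vowel + mark)
    for vowel in VOWELS
    for mark in TONE_MARKS
}

inventory = base_letters | accented
print(len(base_letters), len(accented), len(inventory))
```

Counting ü as a vowel alongside a, e, i, o and u, this gives 26 base
letters and 24 accented ones, 50 letters in all.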