Korean romanizations (Was: Japanese transliteration: ja-Latn-hepburn)

Phillips, Addison addison at amazon.com
Sat Sep 12 20:43:55 CEST 2009

I think it is valid to question whether the requested ‘hepburn’ subtag does, or should, or should not include all manner of Hepburn. The requester clarified his intentions and needs by requesting an additional subtag: that seems to me to fit within the range of normal discussion concerning a registration request. He could also have clarified by stating that he wanted only a subtag for the revised Hepburn originally requested.

I tend to think that all manner of language-subtag-related matters may be discussed on the list, including encouraging or discouraging people from requesting specific things (cf. the machine translation thread) or suggesting various ways of organizing requests, but these discussions become moot if and when a request is made. If Mark had not agreed to widen his request to include Tibetan, or had felt that his request was being damaged by that change, he could have objected. Similarly, others could have requested the Prefix “bo-Latn” separately from the original request to arrive at the same result.

The arbitrariness you are experiencing tends, in my opinion, to be due to a lack of focus on the process as laid out. It is rare that requesters are reminded that they do not have to bow to feedback. RFC 5646 *requires* that the original requester approve any changes to a particular request. “No-no’s” from list correspondents discouraging a request mean nothing if the request goes on to be made (of course, if the reasons for the “no-no’s” were valid, they might influence the decision of the LSR about the subtag, but that is a separate matter).


Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.

From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Kent Karlsson
Sent: Saturday, September 12, 2009 10:41 AM
To: Peter Constable; Mark Davis
Cc: ietf-languages at iana.org; Doug Ewell
Subject: Re: Korean romanizations (Was: Japanese transliteration: ja-Latn-hepburn)

I don't recall anyone urgently needing, and indeed nobody asked for (it came up during discussion), a subtag "pinyin" for Tibetan. Still it got registered. And I don't complain about that. But I do complain (a bit) about the arbitrariness. In the current line of discussion, originally only "revised Hepburn" was asked for. This has during discussion been widened and expanded. But somehow stepping over to Korean results in lots of no-no's. **IF**, perchance, Hepburn romanisation **had** applied also to Korean, would we then have seen these no-no's? There wasn't much objection when "pinyin" was widened to cover also Tibetan as a prefix (something which could have been done later). And why the no-no's when it is suggested that maybe we should register some well-known romanisation for Korean? Just arguing a bit in the hypothetical here, since I'm uncomfortable with too much arbitrariness.

        /kent k

On 2009-09-12 18.16, "Peter Constable" <petercon at microsoft.com> wrote:
I’m inclined to agree with Mark: we could spend a bunch of time coming up with tags for all kinds of variant text representations without knowing which ones people are actually interested in using. I personally don’t want to occupy my time that way. When someone needs a subtag for Hangul Romanization, they’ll come asking, and we’ll sort it out then. I don’t get the impression Doug has an actual usage need at this time.


From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Kent Karlsson
Sent: Saturday, September 12, 2009 3:41 AM
To: Mark Davis
Cc: ietf-languages at iana.org; Doug Ewell
Subject: Re: Korean romanizations (Was: Japanese transliteration: ja-Latn-hepburn)

See below.

On 2009-09-12 02.34, "Mark Davis" <mark at macchiato.com> wrote:


On Fri, Sep 11, 2009 at 03:58, Kent Karlsson <kent.karlsson14 at comhem.se> wrote:

On 2009-09-11 05.04, "Doug Ewell" <doug at ewellic.org> wrote:

> Geez, all I had in mind for Korean was registering the three most common
> romanizations, which anyone familiar with Korean could name off the top
> of their head.

I would suggest that you just submit the appropriate registration forms
to the list. I don't think there is the requirement that the submitter
promises to "start using the subtag for the submitter's immediate needs",
nor that the submitter has been using the variants for which subtags are
requested. I think the subtags you allude to here are useful enough to be

No, but what I'm afraid of is that then someone else will say, well, we might as well do Thai romanizations, and then Lao, and then Russian, and then... There is a pretty unending supply of these things.

Well, that is what this group is here for... Processing such requests.

 1.  "Are useful enough to be registered"
 2.  "Are useful enough to be registered, and someone has a need for it"
I'm just saying that as a working process, #2 gets people what they need, and is manageable by this group. #1 would completely swamp this group.

"Need" does not necessarily mean "we need this in our system now".

    /kent k

This is in contrast to trying to register a subtag for a lone pronunciation
quirk (by itself that hardly makes a dialect) or for using old road/highway
names/numbers instead of their newer names/numbers (totally irrelevant for
language tagging, methinks).

Hardly makes a dialect?  If you are referring to the example I gave, of en-US-socapgfr, it is probably spoken by as many or more people than speak your native language...

Number of speakers does not a dialect make. But I guess this comes down to where the border between dialect and minor pronunciation quirk(s) goes. And I'm sure that can be hotly debated. I would still not count highway reference method (old or new names) as relevant in that regard. Very few dialects have been registered in LSR. If requests for more of them come in, I think it would be good to try to align them with how dialectologists delineate them.
Of course, I was using an extreme example, but it was to make a point. For *somebody* that level of precision might be important, just as for *somebody* the distinction between sl-rozaj-biske and sl-rozaj-njiva, or between sl-rozaj-biske and sl-rozaj-biske-1994 is important. We can't anticipate all the possible differences that people might need, nor can or should we try to second guess that in advance, but what we can do is make sure that the right tags are at the right levels of breadth when we do get requests.

But it also makes the registry very uneven when it comes to variants. I'm not sure how high one's hopes (or worries) for ISO 639-6 should be; I have seen neither text nor tabular data for it, but maybe it could make for a bit more "evenness" w.r.t. variants. (One would have to reactivate LTRU to use -6 in the LSR...)

And overly specific variants, such as your "socapgfr" example, should not be registered anyway (IMO), even if someone were to ask for them. This is what private-use subtags are for.
And transliterations are notoriously tricky. What we actually use in CLDR for Korean, for example, is a "Korean Ministry of Culture & Tourism Transliteration with Clause 8 And Further Modifications for Reversibility Because the Source is Underspecified". (http://cldr.unicode.org/index/cldr-spec/transliteration-guidelines#Korean)
[Just noticed that we need to fix the links; the Ministry keeps moving the pages around -- sigh.]

That would surely be overly narrow for a registered variant subtag. Compare "pinyin", "hepburn", ...

        /kent k