Korean (and Japanese) (was: Re: Suppress-Script batch 1)

Martin Duerst duerst at it.aoyama.ac.jp
Thu Sep 28 06:53:44 CEST 2006

At 03:06 06/09/28, Doug Ewell wrote:

>Last March I began a small research project to determine whether Korean should have a Suppress-Script of Hangul.  I read books, scanned Korean-language newspapers published in Seoul and in Los Angeles, checked numerous Web sites, talked to native Korean speakers in person and by e-mail.  The result was that Hangul is, in fact, used an overwhelming proportion of the time to write modern Korean -- a colleague at work actually used the word "overwhelming" -- but there is still a certain, tiny, regular pattern of usage of Han (hanja) in scholarly works and newspapers.  "Regular" was the key here.  So I concluded that a Suppress-Script for Korean actually would NOT be a good idea, which is not what I expected to find.

This is a particularly tricky one. In modern Korean, what you may want to distinguish
in practice is texts written exclusively in Hangul and texts written mostly in
Hangul, but with a few Hanja interspersed. Historically, you also want to distinguish
texts written exclusively in Hanja.

The "a few Hanja interspersed" can, as far as I understand, go from a very low percentage
(such as <1%; one example I have seen is books on the board game Go, where
Hanja are used for 'black' and 'white', but nothing else) to a much higher
percentage (a potential example might be a political treatise where all the
names and all the technical terms are in Hanja). In terms of readability, these
two examples, although both mixtures, have to be evaluated quite differently:
The former is about as easy to read as pure Hangul, while the latter is quite
a bit more difficult.
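
To make the "percentage of Hanja" idea concrete, here is a minimal Python
sketch that measures the share of Han characters among all Hangul and Han
characters in a text. The function name han_share and the classification by
Unicode character name are assumptions made for illustration only; where to
draw the line between "essentially Hangul" and "mixed" is a judgment call
that the code does not settle.

```python
import unicodedata


def han_share(text: str) -> float:
    """Return the fraction of Han characters among Hangul + Han characters.

    Characters that are neither Hangul nor Han (spaces, punctuation,
    Latin letters, digits) are ignored. Returns 0.0 if the text contains
    no Hangul or Han characters at all.
    """
    han = 0
    hangul = 0
    for ch in text:
        # unicodedata.name() returns e.g. "CJK UNIFIED IDEOGRAPH-9ED1"
        # or "HANGUL SYLLABLE HEUG"; unnamed characters yield "".
        name = unicodedata.name(ch, "")
        if name.startswith(("CJK UNIFIED IDEOGRAPH",
                            "CJK COMPATIBILITY IDEOGRAPH")):
            han += 1
        elif name.startswith("HANGUL"):
            hangul += 1
    total = han + hangul
    return han / total if total else 0.0
```

A Go book that writes only 'black' and 'white' in Hanja would give a value
close to zero, while a treatise with all names and technical terms in Hanja
would give a much higher one.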

In that sense, one can ask oneself whether an example like the former cannot
just be tagged as Hangul, in the same way as e.g. a Latin text with an
occasional Greek or other character would still be tagged as Latin.
For the latter example, the question is whether something similar to
    Subtag: Jpan
    Description: Japanese (alias for Han + Hiragana + Katakana)
should be defined for the Korean case. The problem is that while
conventions for what to write with Kanji, Hiragana, and Katakana in
modern Japanese are quite well established, the practice is quite
a bit more varying in Korean, the way I understand it.

Speaking about Japanese, I'm quite surprised that we don't have
    Suppress-Script: Jpan
    Type: language
    Subtag: ja
    Description: Japanese
    Added: 2005-10-16
When something is tagged ja, the assumption is that it's written in
Kanji-Kana mixture, because that's how Japanese is written. Not having
a Suppress-Script would mean that strictly speaking, ja-Jpan should be
used, which nobody does and would be silly anyway.
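
As a rough sketch of what a Suppress-Script value means for anyone producing
tags, the following Python fragment drops a script subtag when it matches
the Suppress-Script of the language. The table SUPPRESS_SCRIPT and the
function drop_suppressed_script are hypothetical names; the "ja" entry is
the one argued for in this message, not a statement about what the registry
contains. The sketch assumes the script subtag directly follows the
language subtag and ignores extended language subtags and other details of
the tag syntax.

```python
# Hypothetical table: language subtag -> Suppress-Script value.
# The "ja" entry reflects the proposal in this message only.
SUPPRESS_SCRIPT = {"ja": "Jpan"}


def drop_suppressed_script(tag: str) -> str:
    """Remove the script subtag if it equals the language's Suppress-Script.

    Assumes the form language[-script][-more...]; any tag that does not
    match that shape is returned unchanged.
    """
    subtags = tag.split("-")
    if len(subtags) < 2:
        return tag
    lang, second = subtags[0], subtags[1]
    # Script subtags are exactly four letters.
    is_script = len(second) == 4 and second.isalpha()
    suppressed = SUPPRESS_SCRIPT.get(lang.lower(), "")
    if is_script and suppressed.lower() == second.lower():
        return "-".join([lang] + subtags[2:])
    return tag
```

With this table, ja-Jpan reduces to ja, while ja-Latn and ko-Kore are left
alone, because Latn is not the suppressed script for ja and no entry is
assumed for ko.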

>One important fact to keep in mind is that Suppress-Script values may be not only added, but also removed (Section 3.4, item 7).  If we make a mistake, we can go back and fix it.

Oh, at least one thing that we can fix. We should definitely
leave it that way.

Regards,    Martin.

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst at it.aoyama.ac.jp     
