Suppress-Script for Korean?

Doug Ewell dewell at roadrunner.com
Wed Jul 25 08:36:57 CEST 2007


Randy Presuhn <randy underscore presuhn at mindspring dot com> wrote:

>> Are you suggesting that if a document is entirely in (say) hiragana I 
>> shouldn't tag it ja-Hira because Hira is considered a subset of Japn 
>> and Japn is to be suppressed?
>
> For a document longer than a few words to be purely "Hira" would be 
> *very* artificial, and consequently I'd expect it to be marked as such, 
> just as an extended document solely in Kanji or Katakana would also be 
> quite artificial.  On the other hand, let's say it's just a quote of a 
> word or two which would normally be written in pure "Hira".  In that 
> case, it's not a "marked form", so I'd say that simply tagging it "ja" 
> would normally be the appropriate thing to do.

Randy and I are in vigorous agreement on this issue.  My position, as 
stated earlier, is that "Kore" represents "the Korean writing system" 
which typically contains a very large proportion of Hangul and a very 
small amount of Hanja.  According to this model, a text that happens to 
be 100% Hangul could still be considered "Kore" if the essence of the 
writing system is that Hanja are not explicitly avoided.  I consider 
this analogous to Addison's (and others') principle that a text can be 
"Latn" even if it contains a few Greek or Cyrillic letters, as long as the 
essence of the writing system is Latin.

My counterexample is that a children's book that deliberately avoids all 
Hanja would be a suitable example of 'Hang'.

> I think the real consideration is this: does adding the script subtag 
> provide information that could not be reasonably inferred from "ja"? 
> "-Japn" would not provide useful information.  "-Hira" marks the text 
> as being quite out of the ordinary.

(Side note: many people have written 'Japn', but the actual ISO 15924 
code element and script subtag is 'Jpan'.)

I agree again.  Script subtags, like all subtags, should be used if they 
help identify the linguistic usage, and should be avoided if they don't. 
Suppress-Script exists to help identify the cases where a script subtag 
would be superfluous.
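
To make that concrete, here is a rough Python sketch of how a tagging 
tool might apply the idea.  It is only an illustration, not anything 
taken from the registry tooling or from RFC 4646: the function name 
strip_suppressed_script is hypothetical, and the SUPPRESS_SCRIPT table 
is a hand-picked two-entry excerpt (the real values live in the IANA 
Language Subtag Registry).  "ko" is deliberately left out, since that is 
the open question in this thread.

```python
# Hand-picked excerpt for illustration; consult the IANA Language Subtag
# Registry for the actual Suppress-Script values.
SUPPRESS_SCRIPT = {"ja": "Jpan", "en": "Latn"}


def strip_suppressed_script(tag: str) -> str:
    """Drop the script subtag if it equals the language's Suppress-Script.

    Hypothetical helper: assumes the script subtag, when present, directly
    follows the primary language subtag (extended language subtags are
    not handled in this sketch).
    """
    subtags = tag.split("-")
    if len(subtags) < 2:
        return tag  # language subtag only, nothing to strip

    language = subtags[0].lower()
    candidate = subtags[1]

    # Script subtags are exactly four letters (ISO 15924 code elements).
    is_script = len(candidate) == 4 and candidate.isalpha()

    if is_script and SUPPRESS_SCRIPT.get(language) == candidate.title():
        # The script adds no information, so remove it and keep the rest.
        return "-".join([subtags[0]] + subtags[2:])

    # Either there is no script subtag, or it is a "marked" one worth keeping.
    return tag
```

Under these assumptions, "ja-Jpan" collapses to "ja", while "ja-Hira" 
and "ko-Hang" pass through unchanged because the script subtag carries 
information that the language subtag alone does not.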

The question is really:

(a) whether most people who write "ko" alone generally mean "Hangul plus 
Hanja," whether they generally mean "Hangul only," or whether the two 
cases are sufficiently equal that neither can be presumed, and

(b) whether "Hangul plus Hanja" as a concept is applicable even for 
examples that contain no Hanja.

--
Doug Ewell  *  Fullerton, California, USA  *  RFC 4645  *  UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages


