Proposal to add "Kore" as Suppress-Script for "ko"

Addison Phillips addison at yahoo-inc.com
Wed Jul 11 16:24:52 CEST 2007


+1

Suppress-Script is an informative field that helps users know when not 
to use a particular script with a particular language. Really it should 
have taken a different form (an informative field indicating when a 
script subtag should be used, as with Mark's Serbian example), but 
that wouldn't have addressed the concerns of the folks that wanted this 
field: that users would not know not to use the script subtag, as in 
the tag "en-Latn-US", because there was nothing in the 'en' record.
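
To make the "en-Latn-US" case concrete, here is a minimal sketch in Python 
of how the field can be read. The table is a small hand-copied subset 
(an assumption for illustration, not loaded from the registry), the 
function name preferred_form is hypothetical, and extlang subtags are 
ignored. It is meant as an illustration of the idea, not as a 
recommendation to automate tagging with this field.

```python
# Illustrative subset of Suppress-Script values (hand-copied assumption,
# not read from the IANA Language Subtag Registry).
SUPPRESS_SCRIPT = {"en": "Latn", "ru": "Cyrl", "ja": "Jpan"}


def preferred_form(tag: str) -> str:
    """Drop the script subtag when it equals the language's Suppress-Script.

    Hypothetical helper: assumes the script, if present, is the second
    subtag (no extlang) and leaves every other tag unchanged.
    """
    subtags = tag.split("-")
    if len(subtags) < 2:
        return tag
    language = subtags[0].lower()
    candidate = subtags[1]
    # Script subtags are exactly four letters.
    if len(candidate) == 4 and candidate.isalpha():
        if SUPPRESS_SCRIPT.get(language) == candidate.title():
            return "-".join([subtags[0]] + subtags[2:])
    return tag


print(preferred_form("en-Latn-US"))  # script matches Suppress-Script for en
print(preferred_form("sr-Cyrl-RS"))  # sr has no entry in the table above
```

Because 'sr' has no entry in the table, "sr-Cyrl-RS" keeps its script 
subtag, while "en-Latn-US" is reduced to "en-US".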

Usually people are much better off following the "Tag Content Wisely" 
rules and not using subtags that add no information than they are 
relying on Suppress-Script. Certainly this is true for minority 
languages: there is no end to potential Suppress-Script registrations, 
there is a lack of information/motivation to complete the set of 
Suppress-Scripts in the registry, and there are cases where 
Suppress-Script should be ignored.

Adding "Kore" to "ko" seems natural enough, but I also think it is 
problematic (shouldn't we suppress pure Hangul too?). In fact and in 
practice, the script subtag says nothing about the range of characters 
in a document bearing the subtag. I could label this document as being 
"en-Latn-US", even though it includes the word 文字化け.

Addison

Mark Davis wrote:
> Suppress Script is not designed for "automated stuff", nor is it an issue
> that:
> 
>> This
>> Suppress-Script thing, how is it to know that a paragraph in Chinese
>> with three words in Hangul in the middle isn't really Chinese and not
>> Korean?
> 
> After all, one could say this same thing about Japanese: 3 words in Hira in
> the middle of a sequence of Hani; where ja *does* have Suppress-Script of
> Jpan.
> 
> So what is Suppress-Script actually supposed to do? Suppress-Script is to
> give a "preferred form", for compatibility, where a script is not really
> normally needed. Here is an example of that. If I have the language
> identifier en-Latn-US, the preferred form of it is en-US. That is because
> en is customarily not written with anything but characters from Latn.
> Similarly the preferred forms of
> 
> ru-Cyrl-RU
> sr-Cyrl-RS
> ja-Jpan-JP
> ko-Kore-KR
> ...
> 
> are
> 
> ru-RU
> sr-Cyrl-RS // not suppressed, because sr is also customarily written in Latn
> ja-JP
> ko-KR // should be suppressed IMO
> ...
> 
> ja has a Suppress-Script of Jpan because it is customarily not written
> with anything but characters from Jpan. Because Korean is customarily never
> written with any characters outside of Kore, it makes sense to suppress
> Kore; there is really no normal need to supply the script for Korean.
> 
> It of course *does* make sense to say ko-Hani or ko-Hang, just as it makes
> sense to say ja-Kana, ja-Hira, or ja-Hang. And in specialized cases one
> could use these, just as in specialized cases one can distinguish ru-RU
> from ru-Cyrl-RU from ru-Latn-RU (in transliteration). But for normal
> purposes one does not need to mention a script with ko.
> 
> Mark
> 
> On 7/11/07, Michael Everson <everson at evertype.com> wrote:
>>
>> At 22:56 -0700 2007-07-10, Doug Ewell wrote:
>>
>> >The North Korean national character encoding, KPS 9566-97, includes
>> >4,653 Hanja, or more than half the total number of encoded
>> >characters. Then again, it also includes Latin, Greek, Cyrillic,
>> >kana, and a fair number of dingbats.
>> >
>> >I'm willing to withdraw this request if people think it is not
>> >appropriate.
>>
>> I think it's wrong. Kore is an alias for two scripts. This
>> Suppress-Script thing, how is it to know that a paragraph in Chinese
>> with three words in Hangul in the middle isn't really Chinese and not
>> Korean? Kore makes sense from the point of view of ISO 15924, but
>> not, I think, in terms of the automated stuff Suppress-Script is
>> supposed to do.
>>
>> I could be wrong. But my impression is that this request is unsafe.
>> -- 
>> Michael Everson * http://www.evertype.com
>> _______________________________________________
>> Ietf-languages mailing list
>> Ietf-languages at alvestrand.no
>> http://www.alvestrand.no/mailman/listinfo/ietf-languages
>>

-- 
Addison Phillips
Globalization Architect -- Yahoo! Inc.
Chair -- W3C Internationalization Core WG

Internationalization is an architecture.
It is not a feature.

