Distinguishing Greek and Greek
Deborah Goldsmith
goldsmit at apple.com
Wed Mar 9 21:08:04 CET 2005
On Mar 8, 2005, at 7:25 PM, John Cowan wrote:
>> This is very similar in that respect to Hans vs Hant, which is a
>> choice of which different subset of Han characters encoding in
>> Unicode that are used to represent Chinese, and the same reasoning
>> applies.
>
> I agree with Michael that these are not separate characters at the
> user level.
> This is an orthographical reform, not a change of script.
Hans and Hant are not a change of script, either. The two are subsets
of Hani, and there is considerable overlap (Lee Collins estimates
perhaps 60-70%). This is what John Jenkins, who knows quite a bit
about both polytonic Greek and Han characters, had to say:
> I'd say the situations are analogous myself, as both arose from
> relatively recent attempts at language reform in pretty much
> similar ways. Indeed, I'd say that the rationale for separating
> polytonic and monotonic Greek is even stronger than for Hans and
> Hant, because the two Greeks are more clearly separated than the
> two Chineses.
>
> Yes, Hans and Hant have considerable overlap. It's even relatively
> simple to come up with a sentence (e.g., 他是我的朋友) where
> you can't tell whether it's the one or the other on any basis other
> than external tagging. Of course, that's a mildly artificial
> example. In real life, you'd generally not manage to go a complete
> sentence without finding one simplified character along the way --
> but the point is that they have a huge overlap in Unihan.
>
> FWIW, while there are in Unihan only 2636 characters which can be
> considered simplified forms, 1901 of those are in IICore; that is,
> simplified forms are relatively common in actual text. And since
> 1901 (or even 2636) characters are less than a quarter of the bare
> minimum for modern communication, the remainder of the characters
> needed for Hans are naturally enough characters shared with Hant.
>
> (Another measure is that of the 6763 characters from Unihan with
> GB0 mappings, 4383 also have Big Five mappings.)
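As a rough check on the figures quoted above, the proportions can be worked out directly. This is only a sketch of the arithmetic: the four counts are taken from the quoted text, while the constant names and the share() helper are made up here for illustration and are not part of Unihan or any library.

```python
# Counts quoted in the message above (Unihan figures as cited in 2005).
SIMPLIFIED_IN_UNIHAN = 2636   # characters that can be considered simplified forms
SIMPLIFIED_IN_IICORE = 1901   # of those, the ones that are also in IICore
GB0_MAPPED = 6763             # Unihan characters with GB0 mappings
GB0_ALSO_BIG5 = 4383          # of those, the ones that also have Big Five mappings


def share(part: int, whole: int) -> float:
    """Hypothetical helper: part/whole as a percentage, one decimal place."""
    return round(100 * part / whole, 1)


iicore_share = share(SIMPLIFIED_IN_IICORE, SIMPLIFIED_IN_UNIHAN)
big5_share = share(GB0_ALSO_BIG5, GB0_MAPPED)

print(f"Simplified forms that are in IICore: {iicore_share}%")
print(f"GB0 characters that also map to Big Five: {big5_share}%")
```

By this measure roughly 65% of the GB0 repertoire is shared with Big Five, which is in line with the 60-70% overlap estimate mentioned earlier.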
I think the argument for tagging the polytonic/monotonic distinction
as a script subset should be examined in more detail.
Deborah Goldsmith
Internationalization, Unicode liaison
Apple Computer, Inc.
goldsmit at apple.com