Distinguishing Greek and Greek
Deborah Goldsmith
goldsmit at apple.com
Wed Mar 9 02:14:26 CET 2005
I agree that this situation is similar to that for Simplified and
Traditional Han characters, and I believe it should be handled the
same way (with a variant script identifier, as Mark proposes).
Deborah Goldsmith
Internationalization, Unicode liaison
Apple Computer, Inc.
goldsmit at apple.com
On Mar 8, 2005, at 4:25 PM, Mark Davis wrote:
> That is a possibility, but it is sub-optimal, again because (a)
> country differences are generally far less important than script
> differences (including major orthographic variants like monotonic vs
> polytonic), and (b) when language tags are matched, they are treated
> as most-significant-field first.
>
> This is very similar in that respect to Hans vs Hant, which is a
> choice of which subset of the Han characters encoded in Unicode is
> used to represent Chinese, and the same reasoning applies.
>
> Thus for example, suppose I have a web page drawing together different
> sources of information. (I am simplifying the following example for
> illustration.)
>
> 1. The desired text is el-Grkp-GR. I am drawing data from two
> sources, A and B:
> A has data for el-Grkp-GR, el-Grkm-GR
> B has data for el-Grkm-GR and el-Grkp
>
> What I get is then el-Grkp-GR from A and el-Grkp from B. That is,
> under the normal use of most-significant-field first, the best match
> in B for el-Grkp-GR is el-Grkp.
>
> 2. Consider if we coded it with a variant. In that notation, I would
> be asking for el-GR-polyton, and
> A has data for el-GR-polyton, el-GR-monoton
> B has data for el-GR-monoton and el-polyton
>
> What I get is then el-GR-polyton from A, but from B we get the wrong
> result -- we mix it with el-GR-monoton. That is, under the normal use
> of most-significant-field first, the best match in B for
> el-GR-polyton is the one that matches the first two fields, el-GR.
>
> Option #2 (using a variant) presents a mixture of monotonic and
> polytonic to the user, which is not very satisfactory at all. Now, the
> one difference from Han would be if someone objected that Greek is
> only ever spoken/written in a single country, and there would never,
> ever, be any need to have a country variant. If that were the case,
> then encoding as a variant would not be as bad. But not being
> omniscient I am reluctant to make such a strong claim about the use of
> Greek!
>
> Mark
>
> ----- Original Message -----
> From: "Michael Everson" <everson at evertype.com>
> To: "IETF Languages Discussion" <ietf-languages at iana.org>
> Cc: "Erkki Kolehmainen" <eik at iki.fi>
> Sent: Tuesday, March 08, 2005 11:00
> Subject: RE: Distinguishing Greek and Greek
>
>
>> At 10:57 -0800 2005-03-08, Addison Phillips wrote:
>>> 8 characters is the maximum per RFC 3066:
>>>
>>> The syntax of this tag in ABNF [RFC 2234] is:
>>>
>>> Language-Tag = Primary-subtag *( "-" Subtag )
>>>
>>> Primary-subtag = 1*8ALPHA
>>>
>>> Subtag = 1*8(ALPHA / DIGIT)
>>
>> Grand so.
>> --
>> Michael Everson * * Everson Typography * * http://www.evertype.com
>> _______________________________________________
>> Ietf-languages mailing list
>> Ietf-languages at alvestrand.no
>> http://www.alvestrand.no/mailman/listinfo/ietf-languages
>>
>
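
The matching behaviour Mark describes in his two examples can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions: best_match and shared_leading_fields are
hypothetical helper names, "most-significant-field first" is read here
as "prefer the candidate that shares the longest run of leading subtags
with the request", and ties go to the first candidate listed. It is not
the matching procedure of RFC 3066 or of any particular library. The
tags are the ones used in the quoted examples.

```python
def shared_leading_fields(requested, candidate):
    """Count how many leading subtags two language tags have in common.

    Comparison is case-insensitive and stops at the first mismatch, so
    earlier (more significant) fields dominate later ones.
    """
    count = 0
    for req, cand in zip(requested.lower().split("-"),
                         candidate.lower().split("-")):
        if req != cand:
            break
        count += 1
    return count


def best_match(requested, available):
    """Return the available tag sharing the most leading subtags.

    Hypothetical helper: returns None when nothing shares even the
    primary language subtag. Ties go to the first tag listed.
    """
    best = max(available,
               key=lambda tag: shared_leading_fields(requested, tag),
               default=None)
    if best is None or shared_leading_fields(requested, best) == 0:
        return None
    return best


# Example 1: the distinction is carried in the script position.
source_a = ["el-Grkp-GR", "el-Grkm-GR"]
source_b = ["el-Grkm-GR", "el-Grkp"]
print(best_match("el-Grkp-GR", source_a))  # el-Grkp-GR
print(best_match("el-Grkp-GR", source_b))  # el-Grkp

# Example 2: the distinction is carried in a trailing variant.
variant_a = ["el-GR-polyton", "el-GR-monoton"]
variant_b = ["el-GR-monoton", "el-polyton"]
print(best_match("el-GR-polyton", variant_a))  # el-GR-polyton
print(best_match("el-GR-polyton", variant_b))  # el-GR-monoton
```

In the second example the country field is compared before the variant,
so source B yields monotonic text for a polytonic request, which is the
mixture the message objects to.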