Distinguishing Greek and Greek

Deborah Goldsmith goldsmit at apple.com
Wed Mar 9 02:14:26 CET 2005


I agree that this situation is similar to that for Simplified and  
Traditional Han characters, and I believe it should be handled the  
same way (with a variant script identifier, as Mark proposes).

Deborah Goldsmith
Internationalization, Unicode liaison
Apple Computer, Inc.
goldsmit at apple.com

On Mar 8, 2005, at 4:25 PM, Mark Davis wrote:

> That is a possibility, but it is sub-optimal, again because (a) country
> differences are generally far less important than script (including
> major orthographic variants like monotonic vs polytonic), and (b) when
> language tags are matched, they are treated as most-significant-field
> first.
>
> This is very similar in that respect to Hans vs Hant, which is a choice
> of which subset of the Han characters encoded in Unicode is used to
> represent Chinese, and the same reasoning applies.
>
> Thus for example, suppose I have a web page drawing together different
> sources of information. (I am simplifying the following example for
> illustration.)
>
> 1. The desired text is el-Grkp-GR. I am drawing data from two sources,
> A and B:
> A has data for el-Grkp-GR, el-Grkm-GR
> B has data for el-Grkm-GR and el-Grkp
>
> What I get is then el-Grkp-GR from A and el-Grkp from B. That is, under
> the normal use of most-significant-field first, the best match in B for
> el-Grkp-GR is el-Grkp.
>
> 2. Consider if we coded it with a variant. In that notation, I would be
> asking for el-GR-polyton, and
> A has data for el-GR-polyton, el-GR-monoton
> B has data for el-GR-monoton and el-polyton
>
> What I get is then el-GR-polyton from A, but from B we get the wrong
> result -- we mix it with el-GR-monoton. That is, under the normal use of
> most-significant-field first, the best match in B for el-GR-polyton is
> the one that matches the first two fields, el-GR.
>
> Option #2 (using a variant) presents a mixture of monotonic and
> polytonic to the user, which is not very satisfactory at all. Now, the
> one difference with Han would be if someone objected that Greek is only
> ever spoken/written in a single country, and there would never, ever, be
> any need to have a country variant. If that were the case, then encoding
> as a variant would not be as bad. But not being omniscient I am
> reluctant to make such a strong claim about the use of Greek!
>
> Mark
>
> ----- Original Message -----
> From: "Michael Everson" <everson at evertype.com>
> To: "IETF Languages Discussion" <ietf-languages at iana.org>
> Cc: "Erkki Kolehmainen" <eik at iki.fi>
> Sent: Tuesday, March 08, 2005 11:00
> Subject: RE: Distinguishing Greek and Greek
>
>
>> At 10:57 -0800 2005-03-08, Addison Phillips wrote:
>>> 8 characters is the maximum per RFC 3066:
>>>
>>>    The syntax of this tag in ABNF [RFC 2234] is:
>>>
>>>     Language-Tag = Primary-subtag *( "-" Subtag )
>>>
>>>     Primary-subtag = 1*8ALPHA
>>>
>>>     Subtag = 1*8(ALPHA / DIGIT)
>>
>> Grand so.
>> -- 
>> Michael Everson * * Everson Typography *  * http://www.evertype.com
>> _______________________________________________
>> Ietf-languages mailing list
>> Ietf-languages at alvestrand.no
>> http://www.alvestrand.no/mailman/listinfo/ietf-languages
>>
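The two scenarios in Mark's message can be replayed in a few lines of Python. This is only a minimal sketch, assuming that "most-significant-field first" means preferring the candidate that shares the longest run of leading subtags with the requested tag. The function names leading_match and best_match are hypothetical, and the subtags (Grkp, Grkm, polyton, monoton) are taken as-is from the message.

```python
def leading_match(requested, candidate):
    """Count how many leading subtags the two tags share (case-insensitive)."""
    count = 0
    pairs = zip(requested.lower().split("-"), candidate.lower().split("-"))
    for want, have in pairs:
        if want != have:
            break
        count += 1
    return count


def best_match(requested, available):
    """Return the available tag sharing the most leading subtags, or None.

    Assumption: ties go to the first candidate listed.
    """
    best = max(available, key=lambda tag: leading_match(requested, tag),
               default=None)
    if best is None or leading_match(requested, best) == 0:
        return None
    return best


# Option 1: the monotonic/polytonic distinction sits before the country.
source_a = ["el-Grkp-GR", "el-Grkm-GR"]
source_b = ["el-Grkm-GR", "el-Grkp"]
print(best_match("el-Grkp-GR", source_a))  # el-Grkp-GR
print(best_match("el-Grkp-GR", source_b))  # el-Grkp

# Option 2: the distinction is a trailing variant, after the country.
variant_a = ["el-GR-polyton", "el-GR-monoton"]
variant_b = ["el-GR-monoton", "el-polyton"]
print(best_match("el-GR-polyton", variant_a))  # el-GR-polyton
print(best_match("el-GR-polyton", variant_b))  # el-GR-monoton
```

With the distinction in second position, source B falls back to polytonic text without a country. With the distinction in last position, B falls back to el-GR-monoton, which is the mixing of monotonic and polytonic that the message warns about.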



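The RFC 3066 ABNF quoted by Addison can also be checked mechanically. The following is a rough sketch, assuming a regular expression is an acceptable rendering of the three quoted rules; the names LANGUAGE_TAG and is_well_formed are hypothetical. It checks syntax only, not whether a subtag is registered.

```python
import re

# Language-Tag   = Primary-subtag *( "-" Subtag )
# Primary-subtag = 1*8ALPHA
# Subtag         = 1*8(ALPHA / DIGIT)
LANGUAGE_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*")


def is_well_formed(tag):
    """Return True if the whole string fits the quoted RFC 3066 syntax."""
    return LANGUAGE_TAG.fullmatch(tag) is not None


print(is_well_formed("el-GR-polyton"))    # True
print(is_well_formed("el-GR-polytonic"))  # False: "polytonic" has 9 characters
print(is_well_formed("3l-GR"))            # False: primary subtag must be letters only
```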