Distinguishing Greek and Greek

Addison Phillips addison.phillips at quest.com
Wed Mar 9 01:34:15 CET 2005


Not exactly. 

Language matching using the left-matching rule is the opposite of locale matching.

If you specify "el-Grkp-GR", you aren't supposed to get anything less granular than that: "el-Grkp-GR-boont" matches your request, but "el-Grkp" does not.
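
A minimal sketch of that left-matching rule in Python, assuming the case-insensitive comparison that RFC 3066 calls for. The function name range_matches is my own and does not come from any spec or library:

```python
def range_matches(language_range: str, tag: str) -> bool:
    """Left-matching (prefix) rule: the range matches a tag that is equal
    to it, or that begins with it followed immediately by "-"."""
    r = language_range.lower()
    t = tag.lower()
    return t == r or t.startswith(r + "-")


# The request is the *least* specific thing that is acceptable.
print(range_matches("el-Grkp-GR", "el-Grkp-GR-boont"))  # True: more granular
print(range_matches("el-Grkp-GR", "el-Grkp"))           # False: less granular
```

The trailing "-" in the prefix test is what keeps a range from matching a tag whose subtag merely starts with the same letters.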

In the second example you would get "el-GR-polyton" from A and nothing (or the local default language) from B.

In other words, with language matching you specify the least specific content you'll accept. With locale matching you specify the most specific content you want (and fall back to something less specific).
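
For contrast, here is a rough sketch of locale-style fallback, applied to Mark's first example. It assumes fallback works by dropping subtags from the right until something is found; fallback_chain and locale_lookup are hypothetical names of mine, not any real library's API:

```python
from typing import List, Optional


def fallback_chain(tag: str) -> List[str]:
    """List the tag itself, then each shorter form obtained by dropping
    the rightmost subtag: el-Grkp-GR, el-Grkp, el."""
    parts = tag.split("-")
    return ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]


def locale_lookup(requested: str, available: List[str]) -> Optional[str]:
    """Return the first entry of the fallback chain that is available
    (compared case-insensitively), or None if nothing is found."""
    by_lower = {t.lower(): t for t in available}
    for candidate in fallback_chain(requested):
        hit = by_lower.get(candidate.lower())
        if hit is not None:
            return hit
    return None


source_a = ["el-Grkp-GR", "el-Grkm-GR"]
source_b = ["el-Grkm-GR", "el-Grkp"]

print(locale_lookup("el-Grkp-GR", source_a))  # el-Grkp-GR
print(locale_lookup("el-Grkp-GR", source_b))  # el-Grkp
```

Here the request is the most specific thing wanted, and B's less specific "el-Grkp" is an acceptable answer. Under the left-matching rule the same request would get nothing from B, because "el-Grkp" is shorter than the range.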

Variants *do* have a significant impact on how the matching proceeds though.

Addison

Addison P. Phillips
Globalization Architect, Quest Software
Chair, W3C Internationalization Core Working Group

Internationalization is not a feature.
It is an architecture. 

> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-
> bounces at alvestrand.no] On Behalf Of Mark Davis
> Sent: mardi 8 mars 2005 16:26
> To: IETF Languages Discussion; Michael Everson
> Cc: cldr at unicode.org
> Subject: Re: Distinguishing Greek and Greek
> 
> That is a possibility, but it is sub-optimal, again because (a) country
> differences are generally far less important than script differences
> (including major orthographic variants like monotonic vs polytonic), and
> (b) when language tags are matched, they are treated as
> most-significant-field first.
> 
> This is very similar in that respect to Hans vs Hant, which is a choice of
> which subset of the Han characters encoded in Unicode is used to represent
> Chinese, and the same reasoning applies.
> 
> Thus for example, suppose I have a web page drawing together different
> sources of information. (I am simplifying the following example for
> illustration.)
> 
> 1. The desired text is el-Grkp-GR. I am drawing data from two sources, A
> and B:
> A has data for el-Grkp-GR, el-Grkm-GR
> B has data for el-Grkm-GR and el-Grkp
> 
> What I get is then el-Grkp-GR from A and el-Grkp from B. That is, under
> the normal use of most-significant-field first, the best match in B for
> el-Grkp-GR is el-Grkp.
> 
> 2. Consider if we coded it with a variant. In that notation, I would be
> asking for el-GR-polyton, and
> A has data for el-GR-polyton, el-GR-monoton
> B has data for el-GR-monoton and el-polyton
> 
> What I get is then el-GR-polyton from A, but from B we get the wrong
> result -- we mix it with el-GR-monoton. That is, under the normal use of
> most-significant-field first, the best match in B for el-GR-polyton is the
> one that matches the first two fields, el-GR.
> 
> Option #2 (using a variant) presents a mixture of monotonic and polytonic
> to the user, which is not very satisfactory at all. Now, the one
> difference with Han would be if someone objected that Greek is only ever
> spoken/written in a single country, and there would never, ever, be any
> need to have a country variant. If that were the case, then encoding as a
> variant would not be as bad. But not being omniscient I am reluctant to
> make such a strong claim about the use of Greek!
> 
> Mark
> 
> ----- Original Message -----
> From: "Michael Everson" <everson at evertype.com>
> To: "IETF Languages Discussion" <ietf-languages at iana.org>
> Cc: "Erkki Kolehmainen" <eik at iki.fi>
> Sent: Tuesday, March 08, 2005 11:00
> Subject: RE: Distinguishing Greek and Greek
> 
> 
> > At 10:57 -0800 2005-03-08, Addison Phillips wrote:
> > >8 characters is the maximum per RFC 3066:
> > >
> > >    The syntax of this tag in ABNF [RFC 2234] is:
> > >
> > >     Language-Tag = Primary-subtag *( "-" Subtag )
> > >
> > >     Primary-subtag = 1*8ALPHA
> > >
> > >     Subtag = 1*8(ALPHA / DIGIT)
> >
> > Grand so.
> > --
> > Michael Everson * * Everson Typography *  * http://www.evertype.com
> > _______________________________________________
> > Ietf-languages mailing list
> > Ietf-languages at alvestrand.no
> > http://www.alvestrand.no/mailman/listinfo/ietf-languages
> >
> 


