Last call for ISO 15924-based updates

Phillips, Addison addison at amazon.com
Fri Mar 13 17:22:38 CET 2009


> 
> In my opinion, this is a very important comment.

Actually, it's a rather misleading comment. ISO 10646 has the same combining marks that Unicode does, and you certainly can use <0061 0301> to represent an 'a' with an acute accent over it, whether or not one chooses to call that "a with acute". None of this has any bearing here, except to note that U+0301 inherits its script from the base character it's associated with (which was John's point). You could, for example, use it with a Cyrillic or Greek letter instead.
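
To make the point about the two encodings concrete, here is a minimal sketch using Python's standard unicodedata module. The helper name describe and the variable names are mine, purely for illustration. Note that this module does not expose the Script property, so the sketch only shows that U+0301 is a combining mark, that <0061 0301> is canonically equivalent to U+00E1, and that the same mark attaches equally well to Cyrillic or Greek base letters; the "inherited" script value itself is not something the code demonstrates.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name, combining class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]


decomposed = "a\u0301"   # U+0061 followed by U+0301
precomposed = "\u00e1"   # U+00E1

# Canonical equivalence: normalizing converts either form into the other.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
same_after_nfd = unicodedata.normalize("NFD", precomposed) == decomposed

# The same combining mark applied to a Cyrillic and a Greek base letter.
cyrillic = "\u0430\u0301"  # CYRILLIC SMALL LETTER A + combining acute
greek = "\u03b1\u0301"     # GREEK SMALL LETTER ALPHA + combining acute

if __name__ == "__main__":
    for label, text in [("Latin", decomposed), ("Cyrillic", cyrillic), ("Greek", greek)]:
        print(label, describe(text))
    print("NFC(decomposed) == precomposed:", same_after_nfc)
    print("NFD(precomposed) == decomposed:", same_after_nfd)
```

In every sequence the second element is the same U+0301 with a non-zero combining class; only the base letter differs, which is why the mark has no fixed script of its own.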

None of that makes 'Zinh' useful in a language tag. Neither does it constitute a reason not to register 'Zinh' (whose registration is mandatory). It might make a good case for a comment to that effect, as Doug has suggested.

Addison


Addison Phillips
Unicode Partisan

Internationalization is not a feature.
It is an architecture.





> Best regards.
> Gérard LANG
> 
> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no
> [mailto:ietf-languages-bounces at alvestrand.no] On behalf of Keld Jørn Simonsen
> Sent: Friday, March 13, 2009 15:26
> To: Doug Ewell
> Cc: ietf-languages at iana.org
> Subject: Re: Last call for ISO 15924-based updates
> 
> On Thu, Mar 12, 2009 at 06:46:24PM -0600, Doug Ewell wrote:
> > John Cowan <cowan at ccil dot org> wrote:
> >
> > > The whole point of the Zinh code is to signal that the diacritic
> > > changes its script depending on the diacriticized letter.  The acute
> > > accent, for example, has no script of its own; it is understood as a
> > > Latin accent when placed on a Latin letter, but as a Greek accent
> > > when placed on a Greek letter.
> >
> > What Gérard may or may not be aware of, and what powers this entire
> > explanation, is that in Unicode, a diacriticized letter may be
> > represented as two encoded characters, one for the base letter and one
> > for the diacritic.  For example, "a with acute" may be encoded as
> > U+00E1, or it may be encoded as U+0061 plus U+0301.  In the second
> > case, the detached acute accent U+0301 would have the "inherited
> > script" nature.
> 
> We should rather use the ISO standards here, and we do in IETF.
> In ISO 10646 the character "a with acute" can only be represented
> in one way, namely as U+00E1. The other string you are citing is two
> characters in ISO 10646 (and not the "a with acute" character).
> 
> best regards
> keld
> _______________________________________________
> Ietf-languages mailing list
> Ietf-languages at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/ietf-languages

