Last call for ISO 15924-based updates
Phillips, Addison
addison at amazon.com
Fri Mar 13 17:22:38 CET 2009
>
> In my opinion, this is a very important comment.
Actually, it's a rather misleading comment. ISO 10646 has the same combining marks that Unicode does, and you certainly can use <0061 0301> to represent an 'a' with an acute accent over it, whether or not one chooses to call that "a with acute". None of this has any bearing here, except to note that U+0301 inherits its script from the base character it's associated with (which was John's point). You could, for example, use it with a Cyrillic or Greek letter instead.
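To make the point about <0061 0301> concrete, here is a minimal sketch using Python's standard unicodedata module. The helper name with_acute is made up for illustration. Note that unicodedata does not expose the Script property, so this only shows that the same U+0301 attaches to Latin, Greek, and Cyrillic bases and that the Latin sequence is canonically equivalent to U+00E1; it does not show the 'Zinh' value itself.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT


def with_acute(base: str) -> str:
    """Return base + U+0301, normalized to NFC (precomposed where one exists).

    Hypothetical helper for illustration only.
    """
    return unicodedata.normalize("NFC", base + ACUTE)


# Latin a, Greek alpha, Cyrillic a: the same combining mark follows each base.
for base in ("a", "\u03b1", "\u0430"):
    result = with_acute(base)
    names = " + ".join(unicodedata.name(ch) for ch in result)
    print(f"{unicodedata.name(base)}: {len(result)} code point(s): {names}")
```

The Latin and Greek sequences compose to a single code point under NFC, while the Cyrillic one stays as two code points because no precomposed form is encoded. In all three cases the mark is the same character.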
None of that makes 'Zinh' useful in a language tag. Neither does it constitute a reason not to register 'Zinh' (whose registration is mandatory). It might make a good case for a comment to that effect, as Doug has suggested.
Addison
Addison Phillips
Unicode Partisan
Internationalization is not a feature.
It is an architecture.
> Regards.
> Gérard LANG
>
> -----Original Message-----
> From: ietf-languages-bounces at alvestrand.no [mailto:ietf-languages-bounces at alvestrand.no] On Behalf Of Keld Jørn Simonsen
> Sent: Friday, 13 March 2009 15:26
> To: Doug Ewell
> Cc: ietf-languages at iana.org
> Subject: Re: Last call for ISO 15924-based updates
>
> On Thu, Mar 12, 2009 at 06:46:24PM -0600, Doug Ewell wrote:
> > John Cowan <cowan at ccil dot org> wrote:
> >
> > > The whole point of the Zinh code is to signal that the diacritic
> > > changes its script depending on the diacriticized letter. The acute
> > > accent, for example, has no script of its own; it is understood as a
> > > Latin accent when placed on a Latin letter, but as a Greek accent
> > > when placed on a Greek letter.
> >
> > What Gérard may or may not be aware of, and what powers this entire
> > explanation, is that in Unicode, a diacriticized letter may be
> > represented as two encoded characters, one for the base letter and one
> > for the diacritic. For example, "a with acute" may be encoded as
> > U+00E1, or it may be encoded as U+0061 plus U+0301. In the second
> > case, the detached acute accent U+0301 would have the "inherited
> > script" nature.
>
> We should rather use the ISO standards here, and we do in IETF.
> In ISO 10646 the character "a with acute" can only be represented
> in one way, namely as U00E1. The other string you are citing is two
> characters in ISO 10646 (and not the "a with acute" character).
>
> best regards
> keld
> _______________________________________________
> Ietf-languages mailing list
> Ietf-languages at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/ietf-languages