Last call for ISO 15924-based updates

Keld Jørn Simonsen keld at dkuug.dk
Fri Mar 13 15:25:52 CET 2009


On Thu, Mar 12, 2009 at 06:46:24PM -0600, Doug Ewell wrote:
> John Cowan <cowan at ccil dot org> wrote:
> 
> > The whole point of the Zinh code is to signal that the diacritic 
> > changes its script depending on the diacriticized letter.  The acute 
> > accent, for example, has no script of its own; it is understood as a 
> > Latin accent when placed on a Latin letter, but as a Greek accent when 
> > placed on a Greek letter.
> 
> What Gérard may or may not be aware of, and what powers this entire 
> explanation, is that in Unicode, a diacriticized letter may be 
> represented as two encoded characters, one for the base letter and one 
> for the diacritic.  For example, "a with acute" may be encoded as 
> U+00E1, or it may be encoded as U+0061 plus U+0301.  In the second case, 
> the detached acute accent U+0301 would have the "inherited script" 
> nature.

We should rather use the ISO standards here, and we do in IETF.
In ISO 10646 the character "a with acute" can only be represented 
in one way, namely as U+00E1. The other string you are citing is two
characters in ISO 10646 (and not the "a with acute" character).
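
For concreteness, the two encodings under discussion can be compared
with a minimal Python sketch. It assumes only the standard library's
unicodedata module, which follows the Unicode normalization forms; it
shows how the strings differ in code point count and says nothing
about how ISO 10646 itself classifies them. The variable names are
illustrative only.

```python
import unicodedata

# One code point: LATIN SMALL LETTER A WITH ACUTE
precomposed = "\u00e1"
# Two code points: LATIN SMALL LETTER A followed by COMBINING ACUTE ACCENT
decomposed = "\u0061\u0301"

# The strings differ in length and do not compare equal code point by code point.
print(len(precomposed), len(decomposed))
print(precomposed == decomposed)

# Unicode normalization maps one form onto the other (NFC composes, NFD decomposes).
print(unicodedata.normalize("NFC", decomposed) == precomposed)
print(unicodedata.normalize("NFD", precomposed) == decomposed)

# U+0301 is a combining mark: its canonical combining class is non-zero.
print(unicodedata.name("\u0301"), unicodedata.combining("\u0301"))
```

Whether the two-code-point sequence counts as "the same character" is
exactly the point in dispute above; the sketch only makes the
difference between the two strings visible.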

best regards
keld

