Last call for ISO 15924-based updates

Doug Ewell doug at ewellic.org
Fri Mar 13 01:46:24 CET 2009


John Cowan <cowan at ccil dot org> wrote:

> The whole point of the Zinh code is to signal that the diacritic 
> changes its script depending on the diacriticized letter.  The acute 
> accent, for example, has no script of its own; it is understood as a 
> Latin accent when placed on a Latin letter, but as a Greek accent when 
> placed on a Greek letter.

What Gérard may or may not be aware of, and what underlies this entire
explanation, is that in Unicode, a diacriticized letter may be
represented as two encoded characters, one for the base letter and one
for the diacritic.  For example, "a with acute" may be encoded as the
single character U+00E1, or as the sequence U+0061 U+0301.  In the
second case, the detached combining acute accent U+0301 is the
character that has the "inherited script" property.

This is different from ISO 8859-1 and most other character encodings, 
where "a with acute" can only be represented as a single precomposed 
character; and it explains why the concept of "inherited script" exists.
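
As a rough illustration of the two encodings, here is a minimal Python
sketch using only the standard unicodedata module.  The BASE_SCRIPT
table and the inherited_script() and fits_latin1() helpers are
hypothetical names made up for this example (Python's standard library
does not expose the Unicode Script property), so treat it as a sketch
of the idea rather than a real script lookup.

```python
import unicodedata

precomposed = "\u00e1"       # LATIN SMALL LETTER A WITH ACUTE
decomposed = "\u0061\u0301"  # LATIN SMALL LETTER A + COMBINING ACUTE ACCENT

# The two forms are canonically equivalent: normalization converts
# one into the other.
nfd = unicodedata.normalize("NFD", precomposed)  # splits into base + mark
nfc = unicodedata.normalize("NFC", decomposed)   # recombines into U+00E1

# Hypothetical, hand-written table for illustration only; a real
# implementation would consult the Unicode Script property data.
BASE_SCRIPT = {"a": "Latn", "\u03b1": "Grek"}


def inherited_script(base, mark):
    """Return the script a combining mark takes on from its base letter."""
    if unicodedata.combining(mark) == 0:
        raise ValueError("not a combining mark")
    return BASE_SCRIPT[base]


def fits_latin1(text):
    """Report whether the text can be encoded in ISO 8859-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


print(nfd == decomposed, nfc == precomposed)
print(inherited_script("a", "\u0301"), inherited_script("\u03b1", "\u0301"))
print(fits_latin1(precomposed), fits_latin1(decomposed))
```

The same U+0301 comes out as Latin after "a" and as Greek after alpha,
which is the behavior that Zinh is meant to signal.  The last line
shows that ISO 8859-1 can hold only the precomposed form, since it has
no code point for a detached combining accent.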

--
Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages


