Update to clarify combining characters

Cary Karp cary at karp.org
Tue Apr 22 10:10:00 CEST 2014


Quoting Eric,

> ... in Abenaki we use several ASCII character sequences
> interchangeably ("ou", "w" and "8") as well as an "u atop o" character
> defined in one or more extensions to ASCII, which typewriters with
> half-height settings, and the character "8" have accommodated over the
> past century, in support of a local (to a zone) semantic, e.g.,
> equivalency of two labels, e.g., "ou.example" and "8.example" (or
> "wabanaki.example" and "8abanaki.example" and "ouabanaki.example"),

Are there similar non-ASCII examples?

> Obviously, what ICANN gTLD registry operators do is governed by contracts
> between them and ICANN, and what ccTLD registry operators do is also
> governed, in part, by desires for consistency, but below (or outside) of
> these namespaces with _local_ (not pervasive to all levels of the tree)
> restrictions on labels, what resolves is a local question -- local in
> the sense of both the FQDN, the RRSet associated, and the resolvers to
> which query(s) are made.

Does this suggest that there are language communities with a need to have
such intricacy accommodated on lower levels of the gTLD/ccTLD namespace,
who are willing to forgo the possibility of manifesting their languages
directly in TLD labels?

Variation in keyboard practice otherwise appears in many contexts, but it
is difficult to see how this can be factored into the IDNA protocol. My
Swedish keyboard has separate keys for the direct entry of the last
three letters of the Swedish alphabet (å ä ö). These can, however, also
be typed by using the "dead key" that is necessary for the other
diacritically marked letters used in written Swedish. That method
requires the mark to be entered first but it neither displays nor
spaces. The letter with which it combines is then entered and the
corresponding precomposed single-code-point character is displayed.
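
As a rough illustration of that end result (a sketch using Python's
standard unicodedata module; the helper name dead_key_compose is
hypothetical and this does not describe how any keyboard driver is
actually implemented), a base letter followed by a combining mark
normalizes under NFC to the same single code point that a dedicated key
would produce:

```python
import unicodedata

DIAERESIS = "\u0308"  # COMBINING DIAERESIS
ACUTE = "\u0301"      # COMBINING ACUTE ACCENT

def dead_key_compose(base: str, combining_mark: str) -> str:
    """Hypothetical model of dead-key entry: the mark is keyed first, but
    the resulting text is the base letter plus the mark, composed by NFC."""
    return unicodedata.normalize("NFC", base + combining_mark)

composed = [
    dead_key_compose("a", DIAERESIS),  # same as the direct-entry key for ä
    dead_key_compose("o", DIAERESIS),  # same as the direct-entry key for ö
    dead_key_compose("e", ACUTE),      # é, which has no dedicated key
]

for ch in composed:
    # Each result is one code point, regardless of how it was typed.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Whichever entry method is used, the string handed to an application is
the same, which is presumably why the difference never becomes visible
at the protocol level.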

I had always assumed that the trailing order of combining marks was
imposed directly by Unicode and that this simply cascaded into IDNA. Can
that constraint actually be overridden in any situation that would be
trapped by a new contextual rule in 5892?  (If new rules are going to be
added, there are a few others that might be suggested. Is that topic now
open for discussion?)
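
To make the ordering question concrete, here is a small sketch (Python
standard library only; starts_with_combining_mark is a hypothetical
helper, not the 5892 contextual-rule machinery). It assumes that a
combining mark can be recognized by a Unicode general category beginning
with "M", and shows that NFC composes a mark only when it trails its
base:

```python
import unicodedata

def starts_with_combining_mark(label: str) -> bool:
    """Hypothetical check: True if the first code point has general
    category Mn, Mc, or Me (the Unicode mark categories)."""
    return bool(label) and unicodedata.category(label[0]).startswith("M")

mark_after = "o\u0308"   # base letter, then COMBINING DIAERESIS
mark_before = "\u0308o"  # COMBINING DIAERESIS, then base letter

nfc_after = unicodedata.normalize("NFC", mark_after)
nfc_before = unicodedata.normalize("NFC", mark_before)

# Trailing mark: composes to the single precomposed code point.
print(len(nfc_after), starts_with_combining_mark(nfc_after))
# Leading mark: nothing to compose with, so the sequence is unchanged
# and the label still begins with a combining mark.
print(len(nfc_before), starts_with_combining_mark(nfc_before))
```

In this sketch the leading-mark string survives normalization intact, so
it would have to be caught by an explicit check of the kind RFC 5891
already applies to labels that begin with a combining mark.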

/Cary

