Update to clarify combining characters

Eric Brunner-Williams ebw at abenaki.wabanaki.net
Tue Apr 22 18:38:52 CEST 2014

On 4/22/14 1:10 AM, Cary Karp wrote:
>> >... in Abenaki we use several ASCII character sequences
>> >inter-changeably ("ou", "w" and "8") as well as an "u atop o" character
>> >defined in one or more extensions to ASCII, which typewriters with
>> >half-height settings, and the character "8" have accommodated over the
>> >past century, in support of a local (to a zone) semantic, e.g.,
>> >equivalency of two labels, e.g., "ou.example" and "8.example" (or
>> >"wabanaki.example" and "8abanaki.example" and "ouabanaki.example"),
> Are there similar non-ASCII examples?

Values assigned outside the ASCII range for the "u-above-o" combined
character in the UTC repertoire are U+0222 (LATIN CAPITAL LETTER OU) and
U+0223 (LATIN SMALL LETTER OU), reflecting the casing of Latin Script.
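
As a rough illustration, the two code points cited above can be inspected
with Python's standard unicodedata module, and the zone-local equivalence
described in the quoted text can be sketched as a folding function. The
name fold_ou_variants, the choice of "ou" as the canonical form, and the
blanket replacement of every "w" and "8" are assumptions made for this
sketch only; an actual zone would define its own variant policy.

```python
import unicodedata

# Print the character names of the two code points mentioned above.
for cp in (0x0222, 0x0223):
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}")
# U+0222 LATIN CAPITAL LETTER OU
# U+0223 LATIN SMALL LETTER OU


def fold_ou_variants(label: str) -> str:
    """Hypothetical zone-local folding: map every variant spelling to "ou".

    Assumption: "w", "8" and the OU letter are always interchangeable with
    "ou" within this zone, which is a policy choice, not a Unicode rule.
    """
    label = label.lower()  # lowercases U+0222 to U+0223 as well
    for variant in ("\u0223", "8", "w"):
        label = label.replace(variant, "ou")
    return label


# The three spellings from the quoted example fold to the same label.
print({fold_ou_variants(s) for s in ("wabanaki", "8abanaki", "ouabanaki")})
```

Folding to a single canonical form is only one way to express the
equivalence; a zone could equally publish all variants as separate labels
pointing at the same resources.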

>> Obviously, what ICANN gTLD registry operators do is governed by contracts
>> between them and ICANN, and what ccTLD registry operators do is also
>> governed, in part, by desires for consistency, but below (or outside) of
>> these namespaces with _local_ (not pervasive to all levels of the tree)
>> restrictions on labels, what resolves is a local question -- local in
>> the sense of the FQDN, the associated RRSet, and the resolvers to
>> which queries are made.
> Does this suggest that there are language communities with need to have
> such intricacy accommodated on lower levels of the gTLD/ccTLD namespace,
> who are willing to forgo the possibility of manifesting their languages
> directly in TLD labels?

Hmm. The cost of access to the IANA root zone for language communities 
not associated with an ISO-3166-1 assigned code point is bounded below 
by the cost of access to an ICANN new gTLD sales event, nominally a 
one-time fee to ICANN on the order of 200,000 USD with annual recurring 
fees on the order of 50,000 USD, in addition to operational costs 
serving the language community (zone file generation and publication), 
and the transactional cost of providing policied create and modify 
access to the underlying database, associating labels and resources.

Locating the language community namespace subordinate to any but the 
terminal label separator removes the precondition of access, with or 
without an ISO-3166-1 assigned code point, and the attendant one-time 
and recurring annual fees. Owing to the feature (or defect) of the 
algorithm in current use, language communities requiring scripts which 
do not have a left-to-right directionality have fewer existing superior 
label association choices than language communities using scripts which 
have a left-to-right directionality.

There is in general no one-time fee to access subordinate zones, and in 
general the recurring cost of access is four orders of magnitude less 
than the recurring ICANN fee, leaving only the operational costs 
mentioned above.
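
To make the "four orders of magnitude" comparison concrete, here is the
arithmetic as a short sketch. The 50,000 USD figure is the order-of-magnitude
estimate given above, not an exact fee, and the variable names are
illustrative only.

```python
# Order-of-magnitude estimate of the recurring annual ICANN fee (from the text).
ICANN_RECURRING_USD = 50_000

# "Four orders of magnitude less" means dividing by 10**4.
ORDERS_OF_MAGNITUDE = 4

subordinate_recurring_usd = ICANN_RECURRING_USD / 10**ORDERS_OF_MAGNITUDE
print(subordinate_recurring_usd)  # 5.0
```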

I can't speak for the cost-benefit analysis of others, but consider Modern
Chinook, a language I've been working in since September, with a user
community in the North-Eastern Pacific coastal area. It has a caseless
Latin-based script containing combining characters (ch, c'h, kw, k'w, qw,
q'w, tL, t'L, ts, t's, xw, Xw) (where "L" indicates a "barred-ell" and "X"
indicates "x-with-dot-under"), each of which functions as a single lexical
unit, as well as the rarer combining characters (dj, dz, zh), which also
function as single lexical units. These are the subject of this thread.

As with the Wabanaki languages I cited originally (user community in the
North-Western Atlantic coastal area), I can't identify a benefit I think
likely to motivate significant de-allocation of resources from
in-community language programs to an external consumer offering only a
label, significant recurring annual fees, elevated operational cost, and
a loss of some aspects of sovereignty.
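
The idea that each of these sequences functions as a single lexical unit
can be sketched as a greedy longest-match segmenter. This is a minimal
illustration, not a description of any deployed tool: the function name
segment, the greedy strategy, and the test strings are assumptions, and
the units are written in the ASCII placeholder notation used above ("L"
for barred-ell, "X" for x-with-dot-under), so the input is not case-folded.

```python
# Multi-character units listed above, in the message's ASCII placeholder
# notation. Sorted longest first so that "t'L" is tried before "ts".
UNITS = ["ch", "c'h", "kw", "k'w", "qw", "q'w", "tL", "t'L",
         "ts", "t's", "xw", "Xw", "dj", "dz", "zh"]
_BY_LENGTH = sorted(UNITS, key=len, reverse=True)


def segment(word: str) -> list[str]:
    """Split a word into lexical units, preferring the longest match."""
    out = []
    i = 0
    while i < len(word):
        for unit in _BY_LENGTH:
            if word.startswith(unit, i):
                out.append(unit)
                i += len(unit)
                break
        else:
            # No multi-character unit starts here: emit one character.
            out.append(word[i])
            i += 1
    return out


# Made-up strings (not real words), used only to exercise the segmenter.
print(segment("t'Lap"))   # ["t'L", 'a', 'p']
print(segment("k'wats"))  # ["k'w", 'a', 'ts']
```

Greedy matching assumes that an adjacent pair such as "t" followed by "s"
is always the unit "ts"; whether that holds is a question about the
orthography that this sketch does not answer.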

In sum, for the two instances of scripts with combining characters I've 
recited above, and the difference in cost of access to lower levels of 
the namespaces, Cary's suggestion appears to me to be likely to be shown 

I have to confess I'd no idea ICANN has a VIP program, or that the 
[b,c,d,...]name problem space was being addressed by this program.

