Update to clarify combining characters
ebw at abenaki.wabanaki.net
Tue Apr 22 18:38:52 CEST 2014
On 4/22/14 1:10 AM, Cary Karp wrote:
>> >... in Abenaki we use several ASCII character sequences
>> >inter-changeably ("ou", "w" and "8") as well as an "u atop o" character
>> >defined in one or more extensions to ASCII, which typewriters with
>> >half-height settings, and the character "8" have accommodated over the
>> >past century, in support of a local (to a zone) semantic, e.g.,
>> >equivalency of two labels, e.g., "ou.example" and "8.example" (or
>> >"wabanaki.example" and "8abanaki.example" and "ouabanaki.example"),
> Are there similar non-ASCII examples?
Values assigned outside the ASCII range for the "u-above-o" combined
character in the UTC repertoire are U+0222 and U+0223, reflecting the
casing of Latin Script.
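
As a rough illustration of the two code points and of the zone-local equivalence described in the quoted text, here is a minimal Python sketch. The function `fold_label` and its folding rule are hypothetical, invented for this example; they are not part of IDNA or of any registry's policy, and the rule is deliberately crude (it folds every "ou" and every "8" in a label).

```python
import unicodedata

# The two code points named above (upper- and lowercase forms).
OU_UPPER = "\u0222"
OU_LOWER = "\u0223"


def describe(ch):
    """Return the code point and its Unicode character name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def fold_label(label):
    """Hypothetical zone-local folding, NOT a standard algorithm.

    Assumption: the zone treats "ou", "8" and the OU letter as
    alternative spellings of "w" and stores labels in the "w" form.
    """
    folded = label.lower()  # maps U+0222 to U+0223 via Unicode case mapping
    for variant in (OU_LOWER, "ou", "8"):
        folded = folded.replace(variant, "w")
    return folded


if __name__ == "__main__":
    print(describe(OU_UPPER))
    print(describe(OU_LOWER))
    for label in ("wabanaki", "8abanaki", "ouabanaki"):
        print(label, "->", fold_label(label))
```

Under this assumed rule the three labels "wabanaki", "8abanaki" and "ouabanaki" fold to the same stored form, which is one way a zone could implement the local equivalence.
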
>> Obviously, what ICANN gTLD registry operators do is governed by contracts
>> between them and ICANN, and what ccTLD registry operators do is also
>> governed, in part, by desires for consistency, but below (or outside) of
>> these namespaces with _local_ (not pervasive to all levels of the tree)
>> restrictions on labels, what resolves is a local question -- local in
>> the sense of both the FQDN, the RRSet associated, and the resolvers to
>> which query(s) are made.
> Does this suggest that there are language communities with need to have
> such intricacy accommodated on lower levels of the gTLD/ccTLD namespace,
> who are willing to forgo the possibility of manifesting their languages
> directly in TLD labels?
Hmm. The cost of access to the IANA root zone for language communities
not associated with an ISO-3166-1 assigned code point is bounded below
by the cost of access to an ICANN new gTLD sales event, nominally a
one-time fee to ICANN on the order of 200,000 USD with annual recurring
fees on the order of 50,000 USD, in addition to operational costs
serving the language community (zone file generation and publication),
and the transactional cost of providing policied create and modify
access to the underlying database, associating labels and resources.
Locating the language community namespace subordinate to any but the
terminal label separator removes the precondition of access, with or
without an ISO-3166-1 assigned code point, and the attendant one-time
and recurring annual fees. Owing to the feature (or defect) of the
algorithm in current use, language communities requiring scripts which
do not have a left-to-right directionality have fewer existing superior
label association choices than language communities using scripts which
have a left-to-right directionality.
There is in general no one-time fee to access subordinate zones, and in
general the recurring cost of access is four orders of magnitude less
than the recurring ICANN fee, leaving only the operational costs.
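
To put rough numbers on that comparison, here is a small Python calculation using only the order-of-magnitude figures given above. The variable names are mine, and the result is an estimate, not a quoted price from any registry or registrar.

```python
# Order-of-magnitude figures from the text above (USD).
ICANN_ONE_TIME_USD = 200_000
ICANN_RECURRING_USD = 50_000

# "Four orders of magnitude less" means dividing by 10**4.
ORDERS_OF_MAGNITUDE = 4
subordinate_recurring_usd = ICANN_RECURRING_USD / 10**ORDERS_OF_MAGNITUDE

# Subordinate zones generally have no one-time fee (per the text).
subordinate_one_time_usd = 0

if __name__ == "__main__":
    print("subordinate recurring (USD/yr):", subordinate_recurring_usd)
    print("one-time fee avoided (USD):",
          ICANN_ONE_TIME_USD - subordinate_one_time_usd)
```

On these figures the recurring cost of a subordinate zone comes to about 5 USD per year.
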
I can't speak for the cost-benefit analysis of others, but consider Modern
Chinook, a language I've been working in since September. It has a
caseless Latin-based script containing combining characters (ch, c'h,
kw, k'w, qw, q'w, tL, t'L, ts, t's, xw, Xw) (where "L" indicates a
"barred-ell" and "X" indicates "x-with-dot-under"), each of which
functions as a single lexical unit, as well as the rarer combining
characters (dj, dz, zh), which also function as single lexical units;
such characters are the subject of this thread. Its user community is in
the North-Eastern Pacific coastal area. For it, as for the Wabanaki
languages I cited originally (user community in the North-Western
Atlantic coastal area), I can't identify a benefit I think likely to
motivate significant de-allocation of resources from in-community
language programs to an external consumer offering only a label, at the
price of significant recurring annual fees, elevated operational cost,
and a loss of some aspects of sovereignty.
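
To make "functions as a single lexical unit" concrete, here is a minimal Python sketch that splits a string into such units by greedy longest match. It uses the ASCII transcription from this message ("L" for barred-ell, "X" for x-with-dot-under), not the actual orthography, and the function name and test strings are invented for illustration; they are not Chinook words.

```python
# Multi-character units as transcribed in this message.
UNITS = [
    "ch", "c'h", "kw", "k'w", "qw", "q'w",
    "tL", "t'L", "ts", "t's", "xw", "Xw",
    "dj", "dz", "zh",
]

# Try longer units first so "t'L" is not split into "t", "'", "L".
_BY_LENGTH = sorted(UNITS, key=len, reverse=True)


def segment(text):
    """Split text into lexical units using greedy longest match."""
    units = []
    i = 0
    while i < len(text):
        for unit in _BY_LENGTH:
            if text.startswith(unit, i):
                units.append(unit)
                i += len(unit)
                break
        else:
            # No multi-character unit starts here; emit one character.
            units.append(text[i])
            i += 1
    return units


if __name__ == "__main__":
    print(segment("t'Lxwa"))
    print(segment("kwdz"))
```

A greedy match is only one possible choice; a real implementation would need the language community's own rules for ambiguous sequences.
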
In sum, for the two instances of scripts with combining characters I've
recited above, and the difference in cost of access to lower levels of
the namespaces, Cary's suggestion appears to me to be likely to be shown
I have to confess I'd no idea ICANN has a VIP program, or that the
[b,c,d,...]name problem space was being addressed by this program.