Update to clarify combining characters

John C Klensin klensin at jck.com
Wed Apr 23 09:33:43 CEST 2014



--On Tuesday, April 22, 2014 23:58 +0200 "J-F C. Morfin"
<jfc at morfin.org> wrote:

> John,
> it seems that all this is becoming political and economical
> issues totally out of the IETF end to end and my fringe to
> fringe scopes.

Jefsey,

While parts of it, to my surprise and disappointment, seem to be
getting worse (with the ICANN "variant" effort perhaps heading
the list), IDNs have always been driven by politics (especially
if that term is used in a broad sense).  They exist at the
boundary between the DNS, which was not intended for end-user,
natural-language, identifiers and a whole series of requirements
that arise as soon as one wants to accommodate reasonable
end-user expectations.  A broad version of the latter
requirements and issues includes a need for flexibility about
input methods, display, and orthography while preserving
distinctions that are deemed to be important; a need to handle
synonyms and other types of strong aliasing; and tensions
between global and localized identifiers (I think your fringe to
fringe scope ideas fall into the latter category).   Those
issues exist within almost every natural, naturally-evolved,
language as well as between languages.  Most of them would exist
even without considering the additional constraints imposed when
one maps written language into codes that are more or less
convenient for computer use; few of them are issues when uses
are limited to reasonably convenient mnemonics rather than
"words" with bindings to dictionaries, standardized spelling,
and definitional or other semantics.

IMO, given that we are constrained by the limitations of a DNS
that was designed for a different set of requirements, IDNA
represents about as good a balance and set of tricks for getting
around those limitations as we are going to get (and probably as
is possible).  The tradeoffs could have been made differently,
but, at best, only by trading one set of issues for another.
Three things guarantee at least rough edges for the user who
expects locally-sensible and locally-predictable behavior: the
combination of a strong ASCII assumption (including special
handling of case) for octets between 0x00 and 0x7F with an
undefined situation for octets between 0x80 and 0xFF; weak
aliasing that can neither provide aliases for several labels in
the same FQDN at once nor answer inquiries about alternate
aliases; and the inability to do language-sensitive and [other]
context-sensitive flexible matching on the server side.
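
As a rough illustration of the first of those constraints, the
Python sketch below compares labels the way a strong ASCII
assumption implies: only octets for A-Z are case-folded, and any
octet from 0x80 to 0xFF is treated as opaque.  Treating the high
octets as opaque is an assumption made for the example (the text
above calls that range undefined), the function names are
hypothetical, and raw UTF-8 labels are used only to show the
effect; they are not what IDNA places in the DNS.

```python
def fold_ascii_case(label: bytes) -> bytes:
    """Lowercase only octets 0x41-0x5A (ASCII A-Z); leave all others as-is."""
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in label)


def labels_match(a: bytes, b: bytes) -> bool:
    """Compare two labels with ASCII-only case folding (assumed behavior)."""
    return fold_ascii_case(a) == fold_ascii_case(b)


# Pure-ASCII labels match regardless of case.
ascii_match = labels_match(b"Example", b"eXAMPLE")

# In UTF-8, "É" and "é" differ in an octet above 0x7F, so no folding applies
# and the two labels are simply different byte strings.
upper = "École".encode("utf-8")
lower = "école".encode("utf-8")
high_octet_match = labels_match(upper, lower)

print(ascii_match, high_octet_match)
```

A user who expects "École" and "école" to be the same name gets
no help from this kind of matching, which is one of the rough
edges described above.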

We know how to solve those problems (at least "pretty well") but
the solutions require a different naming and name-resolution
environment: either in the form of an "above DNS" naming layer
or some flavor of DNSng that is designed around natural
language, end-user, requirements.  The decision to try to make
the DNS do the IDN job without a richer environment was
inherently political and economic and not technical.  Everyone
who understood the technical limitations of the DNS and who had
even an elementary understanding of the natural language issues
has known all along that IDNs would not fully meet the end user
expectations.  

Given those DNS limitations, coming up with a technical solution
that would meet the requirement to be able to accommodate a
reasonable range of non-ASCII mnemonics within a DNS context,
using Unicode, and without causing more problems than necessary
was, and remains, within IETF scope.  Very little more than that
is.


> 
> At 22:15 22/04/2014, John C Klensin wrote:
>> Wabanaki is no different from the examples that Cary gives
>> for  Swedish except that Wabanaki, to paraphrase a comment
>> made about  another somewhat-endangered language, has a
>> considerably smaller  army and navy than Swedish.
> 
> The interest of such examples, when they are supported by
> dedicated and competent persons such as Eric, is that a good
> technology and governance are to be neutral particulars. This
> needs to be tested by dedicated people. This calls for the
> same amount of work and competence for every language. So the
> real issue is not so much to have a small or a big army/navy
> but to have motivated ones and the widest diversity of them.
> Each one contributes to the common interest.

Yes.  But nothing in what you say contradicts the assertion that
Eric's examples do not demonstrate that Wabanaki works more
poorly in IDNA than Swedish does (there are languages that do
work more poorly), nor Eric's belief that, as I understand it,
languages and scripts without a major and ongoing ICANN presence
and willingness to invest considerable resources are better
accommodated deeper in the DNS tree than at the root.  Although
I think it predates your involvement with this work, Eric will
remember a very similar discussion and conclusion in the late
1980s about the incorporation of nationality groups that did not
have the status of entities in ISO 3166-1 as ccTLDs.  That
discussion was less about economics but still very much about
policy and politics... long before ICANN.

>...

   john


