John C Klensin
klensin at jck.com
Sun Feb 20 16:56:06 CET 2011
--On Sunday, February 20, 2011 16:14 +0100 Simon Josefsson
<simon at josefsson.org> wrote:
> Personally I feel uncomfortable with changes that takes us
> from two different compliant IDNA algorithms to three
> different compliant IDNA algorithms depending on when in time
> you implemented the standards: IDNA2003, IDNA2008-original,
> IDNA2008-revised. This is damaging from a security
> perspective. I don't know what the alternative is though.
Simon, from my perspective the difficulty here is that, for the
edge-case characters involved, the fact that Unicode made a
change (to correct an earlier misunderstanding/error) triggers
an incompatibility no matter how IETF responds. Trying to use
categories similar to those above, we have:
IDNA2003. As you point out, IDNA2003 has its own
issues, issues that are complicated by local decisions
about whether to reduce possible visual confusion
opportunities by mapping a native character form through
ToASCII to a Punycode-encoded form and then through
ToUnicode before displaying it. Some folks believe that
behavior is useful, at least in some circumstances, and
the specification certainly does not prohibit it.
and then either
(i) IDNA2008, original algorithm. This avoids any
changes in the spec, avoids any changes to systems that
implement the algorithm, and avoids any changes to any
implementation that derives its own tables from the spec
when new versions of Unicode appear. Those
implementations include the one that generates the
updated IANA tables, unless a change is made. However,
because their properties have changed, the categories
for the two edge-case characters change.
(ii) IDNA2008, algorithm updated to make special
provision for those characters by adding them to the
exception list. This means that code which generates
the tables has to change. The tables change anyway
because Unicode 6.0 adds characters, changing some code
points from UNASSIGNED to PVALID. But the IDNA
categories of the relevant edge-case characters do not change.
but not both.
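To make the trade-off between (i) and (ii) concrete, here is a minimal sketch in Python. It is a toy model, not the actual derivation procedure: the rule set is reduced to a single general-category test, and the names (derive_category, LETTER_DIGIT_CATEGORIES) and the code points 0xF0000 and 0xF0001 are placeholders invented for illustration, not the real edge-case characters.

```python
# Toy model of rule-based category derivation with an optional exception list.
# Assumption: the real rules are far more elaborate; only the shape of the
# trade-off between options (i) and (ii) is illustrated here.

PVALID, DISALLOWED, UNASSIGNED = "PVALID", "DISALLOWED", "UNASSIGNED"

# Simplified stand-in for the "letters and digits" rule.
LETTER_DIGIT_CATEGORIES = {"Ll", "Lu", "Lo", "Lm", "Mn", "Mc", "Nd"}


def derive_category(code_point, general_categories, exceptions=None):
    """Derive a category from rules; the exception list, if any, wins."""
    if exceptions and code_point in exceptions:
        return exceptions[code_point]
    gc = general_categories.get(code_point)
    if gc is None:
        return UNASSIGNED
    return PVALID if gc in LETTER_DIGIT_CATEGORIES else DISALLOWED


EDGE_CASE = 0xF0000  # placeholder for a character whose properties changed
NEW_CHAR = 0xF0001   # placeholder for a character added in the newer version

# Hypothetical property data for an older and a newer Unicode version.
old_props = {EDGE_CASE: "So"}
new_props = {EDGE_CASE: "Lo", NEW_CHAR: "Lo"}

# Option (i): same rules, new properties. The code is untouched, but the
# derived category of the edge-case character changes.
option_i_old = derive_category(EDGE_CASE, old_props)
option_i_new = derive_category(EDGE_CASE, new_props)

# Option (ii): the rules gain an exception entry. The code (or its data)
# changes, but the derived category of the edge-case character does not.
exceptions = {EDGE_CASE: DISALLOWED}
option_ii_new = derive_category(EDGE_CASE, new_props, exceptions)

# Under either option the newly added character moves out of UNASSIGNED,
# so the tables as a whole change regardless.
new_char_old = derive_category(NEW_CHAR, old_props)
new_char_new = derive_category(NEW_CHAR, new_props, exceptions)
```

In this sketch the only difference between the two options is whether the exceptions mapping is consulted, which is exactly the piece of the rules that would have to be revised under (ii).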
If one thinks about the world in terms of either code stability
for systems that generate tables or normative stability for the
standard and its rules, then the apparent consensus decision
reflected in the current draft (and (i) above) is completely
stable; nothing is happening. If one thinks about the world in
terms of table stability, with the tables really being normative
and not the rules, then it is probably (ii) that is stable
because the table values for those few characters don't change
even if the tables themselves do (because hundreds of new
characters are now defined and allocated).
FWIW, there are only two big conceptual changes between IDNA2003
and IDNA2008 and this is one of them: IDNA2003 used normative
tables and IDNA2008 uses normative rules.
Of course, in practical terms, these characters are so unlikely
to show up in IDNs --except as demonstrations or out of
malice--that the decision we make about them is unlikely to make
any difference at all. The issue has much more to do with how
we think about things and what precedents are being set.