Updating RFC 5890-5893 (IDNA 2008) to Full Standard

John C Klensin klensin at jck.com
Fri Nov 16 02:27:23 CET 2012

--On Thursday, November 15, 2012 15:56 -0800 Mark Davis ☕
<mark at macchiato.com> wrote:

> Anne,
> The development of IDNA2008 was a long, painful, and
> frustrating process, with a split between:
>    - people who were concerned with backwards compatibility
>      and what would happen during a migration period (such as
>      most representatives to the Unicode consortium), and
>    - people who did not feel that it was a concern (such as
>      John, the other authors, and most participants in the WG).
> The rough consensus of the WG was judged to be that backwards
> compatibility and migration were not concerns, and that's what
> went into idna2008.


Certainly there was a disagreement in the WG about which
objectives to prioritize.  However, characterizing the people
who disagreed with you as feeling "that backwards compatibility
and migration were not concerns" doesn't seem accurate to me.
Instead, many of us felt that they _were_ concerns, but that we
saw engineering/design tradeoffs between preserving
compatibility with whatever had been done under the IDNA2003
umbrella and other concerns, including:

(1) The higher degree of identifier (including URL) stability
and predictability that would result from a strict one-to-one
relationship between (in IDNA2008 terms) A-labels and U-labels
was an advantage, and the difficulties that had arisen from
the lack of that relationship in IDNA2003 were a concern.

(2) Some of the incompatible changes made in IDNA2008, including
permitting ZWJ and ZWNJ rather than discarding them and allowing
some characters that IDNA2003 had mapped away, were significant,
and preserving the IDNA2003 behavior instead of allowing
them was a consideration.

(3) Because IRIs were not part of HTTP and were not even
specified until late 2004, at least some of us did not believe
that the use of native character strings in URLs represented
particularly good practice.  The amount of incompatibility
for URLs that conformed to IDNA2003 but (or and) that
contained Punycode-encoded labels was much less than that for
pseudo-URLs that contained native character ("Unicode") strings.

(4) We also expected that the Internet would continue to grow
and that the number of users and web pages that would be
adversely affected by the change would be proportionately far
fewer relative to the total by 2020, or even 2012, than they
were in 2008.  Of course, from that point of view, the longer
those who design web pages and the tools they use are told that
IDNA2003-like conventions will work for the indefinite future,
the larger the proportion of pages that do not conform strictly
to IDNA2008 will linger.  We recognized the tradeoff that
implied as well.  You will recall multiple questions about what
the conditions were for stopping and converting if UTR 46 were

On balance, the WG considered all of those factors and made the
choice -- more difficult for many of us than might have been
obvious in those painful discussions -- that tolerating some
incompatibility to get the advantages it would bring was
preferable to preserving the IDNA2003 behavior, and even
providing continuing support for non-conforming URLs.
Reasonable people can disagree with that decision, but nothing
about it implied that the issues on the other side of the
tradeoff were "not concerns".
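
As a rough illustration of points (1) and (2) above, the sketch
below uses Python's standard-library "idna" codec, which implements
the IDNA2003 operations of RFC 3490, to show several distinct input
strings collapsing to one ASCII form.  For contrast, the helpers
raw_a_label() and from_a_label() are hypothetical names introduced
here: they apply bare Punycode with an "xn--" prefix and perform
none of the IDNA2008 validity checks, so they only approximate the
one-to-one A-label/U-label relationship, they do not implement
IDNA2008.

```python
# Sketch only.  The stdlib "idna" codec is IDNA2003 (RFC 3490);
# the raw_a_label/from_a_label helpers are illustrative assumptions
# and skip every IDNA2008 validity rule.

def to_ascii_2003(label: str) -> str:
    """IDNA2003 ToASCII for a single label (maps, then encodes)."""
    return label.encode("idna").decode("ascii")


def to_unicode_2003(ace: str) -> str:
    """IDNA2003 ToUnicode for a single ASCII-compatible label."""
    return ace.encode("ascii").decode("idna")


def raw_a_label(u_label: str) -> str:
    """Hypothetical helper: 'xn--' + Punycode, with no mapping step."""
    return "xn--" + u_label.encode("punycode").decode("ascii")


def from_a_label(a_label: str) -> str:
    """Hypothetical inverse of raw_a_label()."""
    return a_label[len("xn--"):].encode("ascii").decode("punycode")


if __name__ == "__main__":
    # IDNA2003 maps characters away, so the conversion is lossy.
    print(to_ascii_2003("straße"))            # strasse
    print(to_ascii_2003("BÜCHER"))            # xn--bcher-kva
    print(to_unicode_2003("xn--bcher-kva"))   # bücher (not BÜCHER)
    print(to_ascii_2003("a\u200cb"))          # ab (ZWNJ discarded)

    # Without a mapping step, each string has its own encoded form
    # and decoding returns exactly what was encoded.
    print(raw_a_label("straße"))              # xn--strae-oqa
    print(from_a_label(raw_a_label("straße")) == "straße")
    print(from_a_label(raw_a_label("a\u200cb")) == "a\u200cb")
```

Under IDNA2003 the original spelling cannot be recovered from the
encoded label, which is the lack of a one-to-one relationship that
point (1) refers to; the sharp s and ZWNJ cases are examples of the
characters discussed in point (2).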


