Standardizing on IDNA 2003 in the URL Standard

John C Klensin klensin at
Sat Jan 18 18:04:44 CET 2014

--On Friday, January 17, 2014 17:11 +0100 Bjoern Hoehrmann
<derhoermi at> wrote:

> I read Anne as saying, for the purposes of this discussion, he
> cares about the definition of a `uint8_t* f(codepoint_t*
> input) { ... }` function and not user interface or other
> issues. There was no implication in the quoted text whether
> he cares about `f` being injective. (He might have said
> something about this elsewhere, but not here).


You may have successfully (even if accidentally) identified at
least two of the reasons why these conversations keep going
around in circles.  We seem to be having at least three separate
conversations.  In those conversations, people appear to have
orthogonal success criteria, which always makes communication
and agreement difficult.  So, to caricature them somewhat
(please do not take these as serious examples, just as

(1) Anne says something that I (and a few others) hear as "this
seems to be the existing practice, therefore it is the standard
and we should write it down as such".   One of us says "but we
have discovered that causes the following sorts of serious
problems and risks of leaving users very confused and, moreover,
the definition being used is just too fuzzy".  Anne says
something that sounds like "it is the existing practice, changes
cause damage too, and the existing practice must be preserved".
One of us says "but it causes these particular serious problems
and confuses users, that is real damage that will eventually
require browser changes".   Anne says something that sounds like
"But those changes haven't happened and may never happen.  And,
incidentally, what part of 'current practice' are you having
trouble understanding?"

(2) You say "he cares about the definition of a `uint8_t*
f(codepoint_t* input) { ... }` function and not user interface...".  Some of
us just glaze over and wonder what on earth you think you are
talking about.  Others react and say "Unless we care about users
and user interfaces, there is absolutely no point in IDNs: as
pure identifiers and components of other identifiers, the
Internet (and other systems) can do perfectly well on ASCII
identifiers restricted to what is commonly known as the LDH
form.  In addition, if the issue is really an unambiguous
function, one wants the dual of that function to work and be
unambiguous too, and that means you have to prefer IDNA2008 over
IDNA2003, so what are we arguing about?"

(3) A third group isn't really interested in discussions of
equivalence or mapping among characters except in the context of
string equivalencies.  They think the question of whether you
can or should be addressed as Bjoern, Björn, and/or Bjørn is
the really important one and that, if we aren't willing to
address that question, we are wasting everyone's time by
pretending to talk about internationalization and the DNS or
URIs.  Some of us (possibly including you and Anne) say "yes,
that is an important issue but we are completely bewildered by
your assuming that IDNA or URI have anything to do with it".
They say "but those are the tools you have given us" and "we
have no trouble understanding the relationship among those
strings, why are you being obtuse". We then try to explain
things, but they think our explanations are either dumb excuses
to avoid giving them what they want or just irrelevant.
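
The point in (2) about wanting the reverse of the function to be
unambiguous can be illustrated with a small sketch.  It assumes
Python's built-in `idna` codec, which implements the IDNA2003 ToASCII
operation (RFC 3490 with Nameprep); the helper names `to_ascii_2003`
and `collide` are hypothetical and exist only for this illustration.
It is not a statement about what any browser or the URL Standard does.

```python
# Sketch only: shows that the IDNA2003 mapping is not injective,
# so its output cannot be mapped back to a unique input.

def to_ascii_2003(label: str) -> bytes:
    """Apply IDNA2003 ToASCII via Python's built-in 'idna' codec."""
    return label.encode("idna")

def collide(a: str, b: str) -> bool:
    """True if two different inputs yield the same ASCII form."""
    return a != b and to_ascii_2003(a) == to_ascii_2003(b)

# Nameprep maps U+00DF (sharp s) to "ss" before encoding.
sharp_s_collides = collide("stra\u00dfe", "strasse")

# Nameprep case-folds non-ASCII labels before Punycode encoding.
case_collides = collide("B\u00fccher", "b\u00fccher")

print(sharp_s_collides, case_collides)
```

Whenever two distinct inputs collide this way, nothing in the ASCII
form says which of them the user typed, which is the sense in which
the mapping has no unambiguous reverse.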

I don't know how to make progress until we can agree on how to
determine success or even about how to state questions that
don't contain their own answers.   If we could agree that the
key question is "what has the installed base done and how do we
restate that into a standard?", we might still have
disagreements about important details (I think the oft-repeated
question of what, precisely, 'IDNA2003 plus updates for new
versions of Unicode' means is such a detail), but we'd at least
be having a conversation that could lead to convergence.
Similarly, if we could agree that the linked questions of how we
maximize what the users want to see and minimize user
unhappiness and astonishment were the critical ones, we could
apply a great many other discussions about parts of those topics
to this one and have a conversation that could lead to

But as long as we cannot move forward from having three (or two
if what Anne is saying is actually what you characterize him as
saying) seemingly-unrelated conversations, we are probably just
wasting our time.


