The Two Lookups Approach (was Re: Parsing the issues and finding a middle ground -- another attempt)

John C Klensin klensin at jck.com
Sun Mar 8 07:06:25 CET 2009



--On Saturday, March 07, 2009 20:44 -0800 Erik van der Poel
<erikv at google.com> wrote:

> I'm not suggesting that it needs special terminology. I'm just
> saying that something that might be a valid U-label according
> to a sender might be invalid according to a receiver.
> "U-label" is in the eye of the beholder. One says it is a
> U-label, the other says it isn't.

I see your point, but it really either is a U-label (given a
particular version of Unicode) or it isn't.  That is not really
a matter of interpretation or what people say.  And the
documents are actually quite clear about what happens when there
are different versions of Unicode in use, starting with the
prohibition against looking up UNASSIGNED characters.
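
To make the version dependence concrete, here is a minimal Python
sketch of the UNASSIGNED check alone.  It is not the full lookup
procedure from the documents.  The function names and the sample
label are hypothetical; the sketch assumes a Python build whose
unicodedata module covers Unicode 5.1 or later, and it uses the
standard library's stringprep table A.1 (unassigned code points
in Unicode 3.2) to stand in for a system with older tables.

```python
import stringprep
import unicodedata


def unassigned_in_unicode_3_2(ch: str) -> bool:
    # RFC 3454 table A.1: code points that are unassigned in Unicode 3.2.
    return stringprep.in_table_a1(ch)


def unassigned_in_local_unicode(ch: str) -> bool:
    # "Cn" is the general category of unassigned code points in whatever
    # Unicode version this Python build ships with.
    return unicodedata.category(ch) == "Cn"


def may_look_up(label: str, is_unassigned) -> bool:
    # Hypothetical helper: refuse lookup if any code point is unassigned
    # in the Unicode version the caller knows about.
    return not any(is_unassigned(ch) for ch in label)


# U+1E9E was added in Unicode 5.1, so it is unassigned under 3.2.
label = "stra\u1e9ee"
old_view = may_look_up(label, unassigned_in_unicode_3_2)
new_view = may_look_up(label, unassigned_in_local_unicode)
```

The two views differ only because the tables differ.  Given a
particular version of Unicode, each answer is fully determined.
(U+1E9E is used here only because of when it was assigned; whether
it passes the other validity rules is a separate question.)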

Beyond that, I recommend that people not say "interoperability
problem" or "security issue" any time they see a possible
difference in interpretation or status.  Not only does it
distract from the real interoperability problems, but there is a
little distributed hierarchical database and associated
accessing software and protocols out there that provides for
very slow updating and no guarantees about how long it takes to
bring the various servers that can deliver authoritative answers
into synch.  If one system asks a question and another asks the
same question at the same time, they may get different answers
and get them with no way to determine which one is more correct.
Under some circumstances, one system may get an answer and
another may be told, very definitively, that there is no
information available.  And all of those things can occur in
normal operation, with absolutely no attacks, bad guys, or
intentional bad behavior involved.  

That database system is called the DNS and it has used sloppy
synchronization and poor information about information creation
dates since its creation (TTLs and expiries are deltas from time
of inquiry, not creation-time-based-timeouts).  If we are going
to say that two systems running different versions of Unicode,
and therefore unable to resolve all strings in exactly the same
way at a given time, constitute a security and/or
interoperability problem, then the DNS is not an adequate
system on which to base IDNs (or much of anything else) or in
which IDNs can be implemented and we are through.  Completely
through.
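
The TTL point can be sketched in a few lines of Python.  This is
a toy model under stated assumptions, not a real resolver: the
class names, the fixed 300-second TTL, the explicit "now"
argument, and the example name and addresses are all hypothetical.
The only property it is meant to show is that a cached answer
expires relative to the time of inquiry, so two caches that asked
at different times can disagree in perfectly normal operation.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CacheEntry:
    value: Optional[str]   # None models "no information available"
    expires_at: float      # time of inquiry + TTL, not creation time + TTL


class ToyResolver:
    """Hypothetical caching resolver; illustrates TTL semantics only."""

    def __init__(self, authority: Dict[str, str]) -> None:
        self.authority = authority          # stands in for the zone data
        self.cache: Dict[str, CacheEntry] = {}

    def lookup(self, name: str, now: float, ttl: float = 300) -> Optional[str]:
        entry = self.cache.get(name)
        if entry is not None and now < entry.expires_at:
            return entry.value              # still fresh by this cache's clock
        value = self.authority.get(name)    # ask the authoritative data
        self.cache[name] = CacheEntry(value, now + ttl)
        return value


authority = {"example.test": "192.0.2.1"}
a = ToyResolver(authority)
b = ToyResolver(authority)

first = a.lookup("example.test", now=0)      # a caches the old answer
authority["example.test"] = "192.0.2.2"      # the zone is updated
answer_a = a.lookup("example.test", now=100) # served from a's cache
answer_b = b.lookup("example.test", now=100) # b asks fresh
```

At time 100 the two resolvers return different answers to the
same question, and neither has any way to tell which one is more
current.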

     john



