Additional thoughts on TRANSITIONAL

Thu Dec 3 18:48:07 CET 2009

Hi,

I've been thinking more about the TRANSITIONAL approach, and the sort
of registry-bundle + sunset clause discussion, and I'm wondering if
the two approaches can't be combined to solve the harms that people
seem to feel are present especially from ß.  This is a drafty outline
of how a combined approach could result in the ß being PVALID.  I'm
concentrating on its case right now because (1) it's a slightly
simpler problem and (2) it seems to be the case that many think has
the greatest potential for harm.  If what I am suggesting satisfies
those who think ß is too dangerous, I think the same strategy more or
less can be adopted for other problematic cases.  This is extremely
hand-wavy right now (as it has been in the past when I've described it
casually to people).  I haven't really worked through the details,
but if anyone thinks this is a not-insane way out of the current bind,
then I'll work on it some more.

I should say to begin with that I view our current problem as
basically a co-ordinated update problem.  It's impossible to declare a
flag day.  Those arguing for mapping are basically arguing from the
danger of backward _ambiguous_ compatibility (thanks to Mark for
putting it so clearly).  I regard that objection as a reasonable one,
but I am not totally convinced that the solution ("never ever use
this") is the right answer.  If we had a way to co-operatively
introduce the characters in a predictable way, then clients could know
what to do.

First, the characters we're worried about go into a TRANSITIONAL or
MAYBE category.  If we call it TRANSITIONAL, then I suggest we set
some sunset date by which the TRANSITIONAL characters are just
automatically PVALID, as was already suggested, but I don't feel
strongly about this.  If we use MAYBE, then this feature is permanent.
I prefer TRANSITIONAL because it makes this go away (what I'm
proposing is a kludge).  This would be required immediately, so that
we could go ahead with the rest of the changes in IDNA2008.

We write another document about lookups (or just add a restriction to
protocol) that says how clients looking up these characters might deal
with them.  The default is refuse, so that the danger that some
participants see is minimized: there's no ambiguity, but there is a
failure.  The alternative is to use a mechanism provided by the
registry in order to decide what to do.  Specifying this is also
required immediately in order that we proceed.
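
To make the lookup rule concrete, here is a minimal sketch in Python of
the "refuse by default, otherwise ask the registry" behaviour.  All of
the names here (TRANSITIONAL, prepare_label, the dict-shaped policy) are
made up for illustration, and the ß -> "ss" mapping in the usage note is
only an example of what a registry might publish, not a proposal.

```python
from typing import Dict, Optional

# Hypothetical TRANSITIONAL set; only U+00DF (ß) is discussed in this note.
TRANSITIONAL = {"\u00df"}

# Assumed shape of a registry policy: TRANSITIONAL character -> replacement.
Policy = Dict[str, str]


def prepare_label(label: str, policy: Optional[Policy] = None) -> Optional[str]:
    """Return the label a client should look up, or None to refuse.

    With no registry policy, any TRANSITIONAL character causes a refusal,
    so there is a failure but no ambiguity.
    """
    out = []
    for ch in label:
        if ch not in TRANSITIONAL:
            out.append(ch)                # ordinary character: keep as-is
        elif policy is not None and ch in policy:
            out.append(policy[ch])        # registry says how to treat it
        else:
            return None                   # default: refuse the lookup
    return "".join(out)
```

With no policy, prepare_label("straße") refuses (returns None); with a
policy of {"ß": "ss"} it yields "strasse", and with {"ß": "ß"} the
character passes through unchanged.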

Finally, we write a document that outlines how a registry (== zone
operator) might deal with these characters.  The document outlines
what bundling, if any, can be done for various characters.  It also
specifies a format by which a registry can publish its policy on what
mapping happens.  The location of the policy is placed in an SRV or
NAPTR or some such record.  That allows a client (any client) to find
the policy as published by the zone operator.  Then the client can
look up the policy document, find the character in question and see
how it is mapped.  
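
As a rough illustration of what "find the character in question and see
how it is mapped" could look like, here is a sketch that parses a policy
document.  The JSON layout, the field names ("zone", "mappings", "from",
"to") and the function name are all assumptions of mine; no format has
been specified, and the DNS step that locates the document is left out
entirely.

```python
import json
from typing import Dict

# A made-up policy document: code points are written as hex strings.
EXAMPLE_POLICY = """
{
  "zone": "example.",
  "mappings": [
    {"from": "00DF", "to": ["0073", "0073"]}
  ]
}
"""


def parse_policy(document: str) -> Dict[str, str]:
    """Turn a (hypothetical) JSON policy document into a character mapping."""
    data = json.loads(document)
    mappings: Dict[str, str] = {}
    for entry in data["mappings"]:
        source = chr(int(entry["from"], 16))
        target = "".join(chr(int(cp, 16)) for cp in entry["to"])
        mappings[source] = target
    return mappings
```

Here parse_policy(EXAMPLE_POLICY) gives a mapping from ß to "ss", which
a client could then consult character by character.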

Clients can, in this way, be tuned appropriately if desired by local
users, but they can ship by default with a policy that is tightly
closed.  As registries come up with mappings, they can publish them
and the clients that understand this mechanism can react
appropriately.  Over time, the reaction might even be different (as
users' expectations come to be different) without us needing to
re-open the protocol.

This document does not have to come with the rest of IDNA2008, because
the default "closed" policy on these TRANSITIONAL characters means
that they just don't work to begin with.

The mechanism suffers from some obvious flaws.  First, we're adding
another lookup plus the obtaining of some policy document to a
resolution context.  I think this is partly solvable by putting
expirations on documents and by using longish TTLs on the records.
We're also adding the potential for a lot of never-to-be-satisfied
NAPTR or whatever lookups against authority servers, and that's not
nothing.  It's another pile of stuff to specify, and so these
characters become delayed while we hammer out this specification.  It
requires the invention of a way to specify mappings in an
easy-to-publish and machine-readable format.  It requires more
infrastructure by sites that want to use the mechanism.
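
The cost of the extra lookup could be softened roughly as follows.  This
is only a sketch: PolicyCache and the fetch callable are hypothetical,
and I am assuming the fetch step hands back both the policy (or None if
the zone publishes nothing) and a lifetime in seconds derived from the
record TTL or the document's expiration.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Policy = Dict[str, str]
# Assumed fetch signature: zone -> (policy or None, lifetime in seconds).
Fetch = Callable[[str], Tuple[Optional[Policy], float]]


class PolicyCache:
    """Cache registry policies per zone until their lifetime runs out."""

    def __init__(self, fetch: Fetch, clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Optional[Policy]]] = {}

    def get(self, zone: str) -> Optional[Policy]:
        now = self._clock()
        cached = self._entries.get(zone)
        if cached is not None and cached[0] > now:
            return cached[1]              # still fresh, no new lookup
        policy, lifetime = self._fetch(zone)
        # "No policy" answers are cached as well, so zones that never
        # publish one are not queried again on every resolution.
        self._entries[zone] = (now + lifetime, policy)
        return policy
```

Caching the negative answers is what would keep the never-to-be-satisfied
lookups from hitting the authority servers on every resolution.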

Nevertheless, it does offer the possibility that both ends of the
communication can establish (securely, if this is done with DNSSEC for
the lookup and TLS for obtaining the policy) what the situation is
with respect to characters in the zone.  This would allow, I think,
the more sophisticated client-side mapping that some communities
want, in cases where that mapping can be co-ordinated, while still
providing the stable and predictable default that some are arguing
is necessary.  (Once we had the mechanism, we might use it for
other purposes too -- I originally thought of something like this in
an effort to attack the "cookie problem", but nobody seemed interested
so I haven't pursued it.)

Again, I am aware this is a very sketchy suggestion right now.  That
stipulated, does it sound remotely sane?

A

-- 
Andrew Sullivan
ajs at shinkuro.com
Shinkuro, Inc.