More detail: a sketchy idea for expressing zone policy

Wed Dec 9 17:46:01 CET 2009

Dear colleagues,

I said I'd try to provide a more complete sketch of what I was
imagining for how an IDNA client could discover what sorts of policies
control the data at a given IDNA name.  This message is an attempt to
provide some more details.

I want to emphasise that (1) I am perfectly content if people want to
yell at me about how awful this is, what a filthy kludgey mess this
is, and that nobody should ever do this; and (2) I am perfectly aware
that there were previous discussions about scope limitation and what a
bad idea all that is.  I am unsure how to solve the current impasse
without all this horror (perhaps I'm just a pessimist); certainly
every other proposal I've seen entails a flag day for introducing the
controversial characters to the IDNA namespace.  However gentle the
dawn may be, I regard a flag day as a nasty answer, and I want to find
something else. 

The idea is that a zone operator publishes some rules about the zone's
policy on certain characters in the zone.  Suppose the delegating zone
is example.org; then a client wanting to learn about example.org's
IDNA policy rules makes an SRV query for _idnarules.example.org.
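
As a minimal sketch of that first step, the query name can be built by prepending the _idnarules label to the zone name.  The function name and the trailing-dot handling are illustrative assumptions, not part of the proposal.

```python
def policy_query_name(zone: str, prefix: str = "_idnarules") -> str:
    """Build the owner name a client would query for a zone's IDNA policy."""
    # Strip any trailing dot first so the result ends in exactly one.
    return f"{prefix}.{zone.rstrip('.')}."


print(policy_query_name("example.org"))
```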

If example.org actually has a policy, the answer to that SRV query is
a URI where the policy is stored.  (An empty answer or Name Error
means there is no policy.  More on what to do with this below.)  The
policy is expressed in some format TBD (I guess some sort of XML
document or something like that).  The policy expresses how a
particular Unicode code point is handled (it can contain ranges if
need be).
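
Since the format is TBD, the following is only a sketch of what a parsed policy might look like in memory: a list of rules, each covering one code point or a range.  The Rule class, its field names, and the treatment strings are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    first: int      # first code point covered (inclusive)
    last: int       # last code point covered (inclusive)
    treatment: str  # hypothetical values: "pvalid", "one-way", "two-way"


def rule_for(rules: List[Rule], ch: str) -> Optional[Rule]:
    """Return the first rule whose range covers `ch`, or None."""
    cp = ord(ch)
    for rule in rules:
        if rule.first <= cp <= rule.last:
            return rule
    return None


# Example policy: one single-character rule and one range.
rules = [
    Rule(0x00DF, 0x00DF, "one-way"),  # U+00DF, sharp s
    Rule(0x03B1, 0x03C1, "pvalid"),   # U+03B1..U+03C1, Greek alpha to rho
]
```

A character with no matching rule returns None, which a client could treat as "the policy says nothing about this character".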

The policy can be that a given character is treated as PVALID with no
restriction.  In this case, the character is handled much the way
color and colour are treated in (e.g.) .com today: it's just part of a
label that is unrelated to any other label.

The policy can be that a given character is one-way bundled with
another one, or with some other combination.  So, for instance, the
policy can say that ö is bundled with oe, or that ß is bundled with
ss.  One-way bundling means that any instance of that character
entails that the other character(s) is also used, but that the bundle
target does not always entail the bundled character (so e.g. we don't
automatically get eißtraße from eisstrasse).

The policy can be that a given character is two-way bundled, which
makes two different characters or combinations of characters
equivalent.
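
The difference between the two kinds of bundling can be sketched as follows.  This is a simplification that replaces every occurrence of a bundled character at once; the rule tables are hypothetical, and treating ö/oe as a two-way pair is chosen purely for illustration.

```python
from typing import Dict, Set

# Hypothetical rule tables: bundled character -> bundle target.
ONE_WAY: Dict[str, str] = {"\u00df": "ss"}   # sharp s -> ss
TWO_WAY: Dict[str, str] = {"\u00f6": "oe"}   # o with diaeresis <-> oe


def apply_bundles(label: str, bundles: Dict[str, str]) -> str:
    """Replace every bundled character in `label` with its target."""
    for ch, target in bundles.items():
        label = label.replace(ch, target)
    return label


def one_way_bundle(label: str) -> Set[str]:
    """Labels that go together with `label` under one-way bundling.

    A label containing the bundled character pulls in its target form;
    a label containing only the target pulls in nothing extra.
    """
    return {label, apply_bundles(label, ONE_WAY)}


def two_way_equivalent(a: str, b: str) -> bool:
    """Two labels are equivalent if they reduce to the same target form."""
    return apply_bundles(a, TWO_WAY) == apply_bundles(b, TWO_WAY)
```

Here one_way_bundle applied to the ß spelling of a label yields both spellings, while applied to the ss spelling it yields only that label, which is the asymmetry described above.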

The policy can also contain information about the zone's historic
policies.  For instance, if the zone never registered any German
characters under IDNA2003 rules, that information could be put into
the policy.  (This would not, of course, guarantee that nobody ever
registered anything in the zone that was intended to be resolved by
using, say, ß; but it would be a good indication that the zone
operator set expectations with delegates that German IDNA names
weren't really practical.)

The policy has an initiation time and an expiry time.  These times are
absolute times.  I'm not real excited about using absolute times this
way, but DNSSEC requires it anyway, so I don't think that requiring
absolute time is a disaster.  The validity period may be long (perhaps
months) so that clients that use the policy can save it, and re-use it
later.  This avoids constant additional lookups at the cost of
flexibility.  The idea is that registration policies don't change that
often, so a long-lived policy is ok.
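
A client that has saved a policy could decide whether to re-use it with a check like the one sketched here.  The function and parameter names are assumptions, and the dates are made up for the example.

```python
from datetime import datetime, timezone
from typing import Optional


def policy_is_current(not_before: datetime, not_after: datetime,
                      now: Optional[datetime] = None) -> bool:
    """True if `now` falls inside the policy's absolute validity period."""
    if now is None:
        now = datetime.now(timezone.utc)
    return not_before <= now <= not_after


# Hypothetical six-month validity period.
start = datetime(2009, 12, 1, tzinfo=timezone.utc)
end = datetime(2010, 6, 1, tzinfo=timezone.utc)
```

A saved policy that fails this check would have to be fetched again before the client relies on it.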

A client can look up the policy on first (IDNA) resolution in the
zone.  If the client wants to provide backward-compatible behaviour
(e.g. to guarantee that ß is never registered by anyone except the
registrant of another name spelled the same, only with ss), then it
can insist on a minimum one-way bundling rule for the character in
question.  Without such a rule available, the client will refuse to
look up the "IDNA2008-style" name.  This provides the failure mode that
Mark in particular suggested was better than ambiguity.
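
The refusal logic might look roughly like this sketch.  It assumes the policy has already been reduced to a mapping from character to a treatment string ("pvalid", "one-way", "two-way"; hypothetical names), and it reads "minimum one-way" as meaning that two-way bundling is also acceptable.

```python
from typing import Dict, FrozenSet, Optional

# Characters for which this client insists on bundling (an assumption).
REQUIRE_BUNDLING: FrozenSet[str] = frozenset({"\u00df"})  # sharp s


def may_look_up(label: str, policy: Optional[Dict[str, str]],
                required: FrozenSet[str] = REQUIRE_BUNDLING) -> bool:
    """Decide whether to look up the IDNA2008-style form of `label`.

    `policy` maps a character to its treatment; None means the zone
    publishes no policy at all.
    """
    for ch in label:
        if ch not in required:
            continue
        if policy is None or policy.get(ch) not in ("one-way", "two-way"):
            return False  # no sufficient rule: refuse rather than guess
    return True
```

Labels that contain none of the characters in question are unaffected, whether or not the zone publishes a policy.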

At the same time, the client can be configured to accept different
policies under different circumstances, so that the gradual adoption
of the controversial characters becomes possible in limited
circumstances.  This means that if a user community is familiar with
the actual use of (say) final sigma, it can get the benefits of the
character in a zone publishing its policies as soon as members of the
community have new software.  Nobody has to wait for a "flag day" to
roll around.

I think it wouldn't be a bad idea if clients that know about policy
treat the controversial characters just as under IDNA2003
(i.e. they're mapped) in the complete absence of a policy.  I am
susceptible to argument on this point, however.
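
That fallback could be sketched as below.  The table covers only ß and final sigma, with the mappings IDNA2003 applies to them; the function name and the boolean policy_found flag are assumptions, and what a client does when a policy is present is deliberately left out.

```python
# IDNA2003 mappings for two of the controversial characters only.
IDNA2003_MAP = {
    "\u00df": "ss",      # sharp s -> ss
    "\u03c2": "\u03c3",  # final sigma -> sigma
}


def name_to_resolve(label: str, policy_found: bool) -> str:
    """Fall back to IDNA2003-style mapping when the zone has no policy."""
    if policy_found:
        # What happens next depends on the policy; not modelled here.
        return label
    return "".join(IDNA2003_MAP.get(ch, ch) for ch in label)
```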

I'm not sure if this still-sketchy suggestion helps at all, but since
I said yesterday I would try to say something more, I thought I'd
better.  

Best regards,

A

-- 
Andrew Sullivan
ajs at shinkuro.com
Shinkuro, Inc.