More detail: a sketchy idea for expressing zone policy
Shawn.Steele at microsoft.com
Wed Dec 9 18:20:23 CET 2009
IMO the interesting part of these characters is preserving the form, not making a distinction between them. If we were going to invent a completely new system that is complicated, I'd much prefer "just" providing a mechanism to indicate the preserved form.
One problem with the per-zone rules is complexity and whether or not people would adopt it.
Another problem is that it doesn't help when processing disconnected strings. If I wanted to validate a URL in an address book, I'd have to look it up.
What happens when the zone changes the policy? (Or adds a policy when it suddenly becomes aware of the option?)
If we were going to provide rules that effectively cause bundling per-zone for certain characters, then I'd think about extending it to some of the other sequences discussed on the list. E.g., if I want to bundle ß and ss, then maybe I also want to bundle ö and oe. Or maybe neither of those, but ı and I.
From: idna-update-bounces at alvestrand.no [idna-update-bounces at alvestrand.no] on behalf of Andrew Sullivan [ajs at shinkuro.com]
Sent: Wednesday, December 09, 2009 8:46 AM
To: idna-update at alvestrand.no
Subject: More detail: a sketchy idea for expressing zone policy
I said I'd try to provide a more complete sketch of what I was
imagining for how an IDNA client could discover what sorts of policies
control the data at a given IDNA name. This message is an attempt to
provide some more details.
I want to emphasise that (1) I am perfectly content if people want to
yell at me about how awful this is, what a filthy kludgey mess this
is, and that nobody should ever do this; and (2) I am perfectly aware
that there were previous discussions about scope limitation and what a
bad idea all that is. I am unsure how to solve the current impasse
without all this horror (perhaps I'm just a pessimist); certainly
every other proposal I've seen entails a flag day for introducing the
controversial characters to the IDNA namespace. However gentle the
dawn may be, I regard a flag day as a nasty answer, and I want to find
The idea is that a zone operator publishes some rules about the zone's
policy on certain characters in the zone. Suppose the delegating zone
is example.org; then a client wanting to learn about example.org's
IDNA policy rules makes an SRV query for _idnarules.example.org.
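
A minimal sketch of how a client might derive that query name. The
"_idnarules" label comes from the proposal; the function name is
hypothetical, and no DNS lookup is performed here.

```python
POLICY_LABEL = "_idnarules"  # label taken from the proposal


def policy_query_name(zone: str) -> str:
    """Build the owner name a client would query for a zone's IDNA policy."""
    zone = zone.strip().rstrip(".")  # accept "example.org" or "example.org."
    if not zone:
        raise ValueError("zone name must not be empty")
    return f"{POLICY_LABEL}.{zone}"


print(policy_query_name("example.org"))  # prints _idnarules.example.org
```
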
If example.org actually has a policy, the answer to that SRV query is
a URI where the policy is stored. (An empty answer or Name Error
means there is no policy. More on what to do with this below.) The
policy is expressed in some format TBD (I guess some sort of XML
document or something like that). The policy expresses how a
particular Unicode code point is handled (it can contain ranges if
The policy can be that a given character is treated as PVALID with no
restriction. In this case, the character is handled much the way
color and colour are treated in (e.g.) .com today: it's just part of a
label that is unrelated to any other label.
The policy can be that a given character is one-way bundled with
another one, or with some other combination. So, for instance, the
policy can say that ö is bundled with oe, or that ß is bundled with
ss. One-way bundling means that any instance of that character
entails that the other character(s) is also used, but that the bundle
target does not always entail the bundled character (so e.g. we don't
automatically get eißtraße from eisstrasse).
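
One-way bundling as described above could be sketched roughly as
follows. The rule table and the function name are assumptions for
illustration only; the policy format itself is still TBD.

```python
from typing import Mapping, Optional

# Hypothetical one-way rules: bundled character -> bundle target.
ONE_WAY_BUNDLES = {"ß": "ss", "ö": "oe"}


def entailed_label(label: str,
                   rules: Mapping[str, str] = ONE_WAY_BUNDLES) -> Optional[str]:
    """Return the label that `label` entails under one-way bundling.

    Returns None when the label contains no bundled character, so the
    bundle target never entails the bundled form.
    """
    target = label
    for bundled, replacement in rules.items():
        target = target.replace(bundled, replacement)
    return target if target != label else None


print(entailed_label("straße"))   # prints strasse
print(entailed_label("strasse"))  # prints None
```
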
The policy can be that a given character is two-way bundled, which
makes two different characters or combinations of characters
The policy can also contain information about the zone's historic
policies. For instance, if the zone never registered any German
characters under IDNA2003 rules, that information could be put into
the policy. (This would not, of course, guarantee that nobody ever
registered anything in the zone that was intended to be resolved by
using, say, ß; but it would be a good indication that the zone
operator set expectations with delegates that German IDNA names
weren't really practical.)
The policy has an initiation time and an expiry time. These times are
absolute times. I'm not real excited about using absolute times this
way, but DNSSEC requires it anyway, so I don't think that requiring
absolute time is a disaster. The validity period may be long (perhaps
months) so that clients that use the policy can save it, and re-use it
later. This avoids constant additional lookups at the cost of
flexibility. The idea is that registration policies don't change that
often, so a long-lived policy is ok.
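
A client that saves a policy would need a check along these lines
before re-using it. This is only a sketch: the field names are
hypothetical, and treating the expiry time as exclusive is an
assumption, not something the proposal specifies.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PolicyValidity:
    initiation: datetime  # absolute start of the validity period (UTC)
    expiry: datetime      # absolute end of the validity period (UTC)

    def is_valid_at(self, when: datetime) -> bool:
        """True if a saved policy may still be used at `when`."""
        return self.initiation <= when < self.expiry


saved = PolicyValidity(
    initiation=datetime(2010, 1, 1, tzinfo=timezone.utc),
    expiry=datetime(2010, 7, 1, tzinfo=timezone.utc),
)
print(saved.is_valid_at(datetime(2010, 3, 15, tzinfo=timezone.utc)))  # prints True
print(saved.is_valid_at(datetime(2010, 8, 1, tzinfo=timezone.utc)))   # prints False
```
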
A client can look up the policy on first (IDNA) resolution in the
zone. If the client wants to provide backward-compatible behaviour
(e.g. to guarantee that ß is never registered by anyone except the
registrant of another name spelled the same, only with ss), then it
can insist on a minimum one-way bundling rule for the character in
question. Without such a rule available, the client will refuse to
look up the "IDNA2008-style" name. This provides the failure mode that
Mark in particular suggested was better than ambiguity.
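
The backward-compatible client check might look roughly like this.
The Rule names, the policy mapping, and the set of guarded characters
are all hypothetical, and counting a two-way rule as satisfying
"minimum one-way" is an assumption.

```python
from enum import Enum
from typing import FrozenSet, Mapping


class Rule(Enum):
    PVALID = "pvalid"    # unrestricted, unrelated to any other label
    ONE_WAY = "one-way"  # character entails its bundle target
    TWO_WAY = "two-way"  # assumed to be at least as strong as one-way


def may_look_up(label: str,
                policy: Mapping[str, Rule],
                guarded: FrozenSet[str] = frozenset("ß")) -> bool:
    """Refuse the lookup unless every guarded character in the label
    is covered by at least a one-way bundling rule in the zone policy."""
    acceptable = {Rule.ONE_WAY, Rule.TWO_WAY}
    for ch in label:
        if ch in guarded and policy.get(ch) not in acceptable:
            return False  # fail rather than resolve ambiguously
    return True


print(may_look_up("straße", {"ß": Rule.ONE_WAY}))  # prints True
print(may_look_up("straße", {"ß": Rule.PVALID}))   # prints False
print(may_look_up("straße", {}))                   # prints False
```
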
At the same time, the client can be configured to accept different
policies under different circumstances, so that the gradual adoption
of the controversial characters becomes possible in limited
circumstances. This means that if a user community is familiar with
the actual use of (say) final sigma, it can get the benefits of the
character in a zone publishing its policies as soon as members of the
community have new software. Everyone in the world doesn't have to
wait for "flag day" to roll around.
I think it wouldn't be a bad idea if clients that know about policy
treat the controversial characters just as under IDNA2003
(i.e. they're mapped) in the complete absence of a policy. I am
susceptible to argument on this point, however.
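
For illustration, that fallback could be sketched as below. The table
covers only the four characters at issue (IDNA2003 mapped ß to ss and
final sigma to sigma, and removed ZWJ and ZWNJ); it is not a full
Nameprep implementation, and the function name is hypothetical.

```python
# IDNA2003-style handling of the controversial characters only.
IDNA2003_MAPPINGS = {
    "\u00df": "ss",      # ß (sharp s) -> ss
    "\u03c2": "\u03c3",  # ς (final sigma) -> σ
    "\u200c": "",        # ZWNJ is removed
    "\u200d": "",        # ZWJ is removed
}


def map_without_policy(label: str) -> str:
    """Map the controversial characters as IDNA2003 did, for use when
    the zone publishes no policy at all."""
    return "".join(IDNA2003_MAPPINGS.get(ch, ch) for ch in label)


print(map_without_policy("straße"))  # prints strasse
```
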
I'm not sure if this still-sketchy suggestion helps at all, but since
I said yesterday I would try to say something more, I thought I'd
ajs at shinkuro.com