Distributed configuration of "private" IDNA (Re: IDNA and getnameinfo() and getaddrinfo())

Thu Jun 17 23:28:04 CEST 2010

--On Thursday, June 17, 2010 16:57 -0400 Andrew Sullivan
<ajs at shinkuro.com> wrote:

>> DNS can't work interoperably with multiple IDN rulesets for
>> the simple reason that to do so would require code to decide
>> amongst IDN rules to apply in context-specific manners.  
> 
> Right.  See John Klensin's previous remarks about this: in
> small communities of well-known behaviour, your favourite
> encoding as octets in the zone works fine.  But given that we
> have multiple different encodings, we surely do have a
> problem.  It's nevertheless simply too late to say that the
> only thing anyone is allowed to put in a DNS zone is an
> A-label.  We don't get to reformat the Internet like that.  The
> DNS rules were established a long time ago, so there _is_
> non-A-label data in zone files already.

And that clearly applies to server-side application of UTR46 or
any other trick matching as well.  It isn't just that it
violates the spec (since the server-side matching rules for
octets are extremely clear), it is that some servers would be
extended to handle the special mapping, some would not, and one
couldn't tell the difference.  Even then, one would have to
assume that every server that did any mapping did it the same
way.  Despite a lot of interesting ideas and no matter how many
standards were approved by whatever bodies approved them, that
is profoundly unrealistic.   With or without different mapping
variations, "some map and some don't" could, in turn, easily
yield false positives, false negatives, and a collection of
"interesting" attack vectors.

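A minimal sketch of the "some map and some don't" problem, in
Python.  The octet comparison follows the classic DNS rule (raw
octets, ASCII-only case folding).  The mapping function is purely
hypothetical: it uses NFKC plus Unicode case folding as a crude
stand-in for a UTR46-style mapping, and is not what any real server
implements.

```python
import unicodedata

def _ascii_lower(label: bytes) -> bytes:
    # Fold only A-Z (0x41-0x5A); every other octet is left untouched.
    return bytes(b + 32 if 65 <= b <= 90 else b for b in label)

def octet_match(query: bytes, owner: bytes) -> bool:
    """Classic DNS label comparison: octets, ASCII case-insensitive."""
    return _ascii_lower(query) == _ascii_lower(owner)

def mapping_match(query: bytes, owner: bytes) -> bool:
    """HYPOTHETICAL server-side mapping (assumption, not a real spec):
    treat labels as UTF-8, then compare after casefold + NFKC."""
    def norm(label: bytes) -> str:
        return unicodedata.normalize("NFKC", label.decode("utf-8").casefold())
    return norm(query) == norm(owner)

owner = "b\u00fccher".encode("utf-8")   # label stored in the zone
query = "B\u00dcCHER".encode("utf-8")   # what a client happens to send

plain_server = octet_match(query, owner)      # server without mapping
mapping_server = mapping_match(query, owner)  # server with mapping
```

The same query against the same zone data matches on one server and
fails on the other, and the client has no way to tell which kind of
server answered.
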
>> If you really, really want this to work, then start thinking
>> about solutions along the lines of my strawman proposal for
>> an NS-like RR that indicates what IDN rules apply to
>> delegated zones.  I'd rather help make IDNA2008 better by
>> working on the APIs aspect of the problem.
> 
> I suggested similar things more than once over the past couple
> years, and people told me every time that I might be running
> for the position of "Bad Idea Fairy".

What causes the BIF problem is the
slightly-odd relationship between NS records and the RR sets to
which one wants the data to be bound ("slightly-odd" not because
the behavior isn't well defined but because it doesn't do what
one wants for this purpose).  If nothing else, the possible
error states when the NS and interpretation records contained
different information in the parent and child zones would be a
challenge -- well-defined, if sometimes surprising to the naive
for the NS case, but an interesting design challenge for the
"label interpretation" case, especially if one remembers that a
cache would have to retrieve and maintain the interpretation
data on a zone by zone basis (probably not apparent-label by
apparent-label).

The DNS works as well as it does partially because, while caches
have to follow a few specific rules (including those for
octet-level matching of labels in length-label pair form),
caches can be pretty dumb.  Asking caches to be smart and able
to reflect whatever matching rules the authoritative servers
(and/or their authoritative parents) think appropriate means
_really_ smart caches.
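
To make "pretty dumb" concrete, here is a rough sketch in Python of
how little a cache needs to know about labels: names are sequences
of (length, octets) pairs, and lookup is an exact match after
ASCII-only case folding.  The function names and the dictionary
cache are illustrative assumptions, not taken from any resolver,
and the 255-octet limit on whole names is not checked.

```python
def to_wire(labels):
    """Encode labels as length-label pairs, ending with the root label."""
    out = bytearray()
    for label in labels:
        if not 1 <= len(label) <= 63:
            raise ValueError("label must be 1..63 octets")
        out.append(len(label))   # one length octet
        out += label             # followed by that many octets
    out.append(0)                # zero-length root label
    return bytes(out)

def cache_key(labels):
    # Fold A-Z only; all other octets (including UTF-8) compare as-is.
    folded = [bytes(b + 32 if 65 <= b <= 90 else b for b in l) for l in labels]
    return to_wire(folded)

cache = {}
cache[cache_key([b"www", b"Example", b"com"])] = "192.0.2.1"

hit = cache.get(cache_key([b"WWW", b"example", b"COM"]))
miss = cache.get(cache_key(["b\u00fccher".encode("utf-8"), b"example"]))
```

Nothing here depends on which zone a name lives in.  Per-zone
matching rules would replace the single `cache_key` function with
something that first has to discover, fetch, and keep current the
rules for each zone.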

And then there is the DNAME possibility and the consequent need
for new primitives that authoritatively identify the tree in
which an FQDN target is really located.

If one wanted it to work, I suggest that one would want to start
by deprecating DNAME and maybe CNAME so that there was exactly
one way to access a particular DNS node.  Then one would need to
think about at least one of a new Label Type (my current
favorite), a new Class (probably not good enough, my early
proposal to that effect notwithstanding), or an EDNS0 option to
permit a client to differentiate among servers applying
different rules (as far as I know, not yet comprehensively
evaluated by anyone).  All three of those options have two
things in common:

	(i) Good luck getting them deployed soon enough and
	widely enough to do anyone any good. Think in decades.

	(ii) We would still be stuck with legacy A-labels in
	zones and the need to sort them out in applications.
	Some zones could be expected to at least stop adding
	more of them but those that were driven by either market
	or compatibility considerations would probably discover
	that they had to deploy every name according to both the
	old (IDNA A-labels) and new (e.g., UTF8 with UTR46-2025)
	conventions.   Synchronized domains anyone?  :-(
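
As an illustration of what "both conventions" would mean for a
single name, the sketch below derives the two forms of one label in
Python.  It uses the standard library's raw "punycode" codec and
does no IDNA validation or mapping at all, so it is only suitable
for an already-valid, non-ASCII U-label; `both_forms` is a
hypothetical helper name.

```python
def both_forms(u_label: str):
    """Return (A-label octets, UTF-8 octets) for one non-ASCII label.
    ASSUMPTION: u_label is already a valid, lowercase, NFC U-label;
    no IDNA2008 or UTR46 checks are performed here."""
    a_label = b"xn--" + u_label.encode("punycode")  # old convention
    utf8_label = u_label.encode("utf-8")            # possible new convention
    return a_label, utf8_label

a_form, utf8_form = both_forms("b\u00fccher")
```

A zone in that position would carry records under both owner names
and would have to keep the two sets of data identical.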

While you are at it, I'd like a pony.  Actually, I'd like a
whole corral full of ponies.

    john