Impact of Punycode

Andrew Sullivan ajs at shinkuro.com
Fri Mar 26 01:47:23 CET 2010


On Fri, Mar 26, 2010 at 12:23:05AM +0000, Shawn Steele wrote:
> An AD server serves UTF-8 machine names, such as in an Intranet.
> The problem isn't the AD server per-se, but what is a client
> supposed to do when it gets a non-ASCII address?  Query UTF-8?
> Query Punycode?  Both?

This is no change whatever from how the application had to work
before, because when it received a series of 8-bit labels it had to
know how to treat those 8-bit labels.  Non-ASCII labels are clearly
not in hostname format, and so they have to be interpreted by the
application which has to know what to do with that label.  Nothing has
changed here.

> What's a UTF-8 aware server supposed to do when it gets a punycode
> address that matches a UTF-8 address that it has registered?

See above.

> What's the server supposed to do if it gets a Unicode address that
> matches a registered punycode address?

See above.  I won't keep repeating.

The problem with these "UTF-8 aware servers" you're talking about is
that the 8 bit addresses did not have a well-defined interoperating
behaviour, because the early DNS specifications made the labels 8
bits, but said that everyone should use the 7 bit LDH syntax.
Interoperation says that if you plan to work on the Internet with a
wide variety of different systems, you should stick with behaviour
that is well-defined, well-understood, and interoperable (this is the
"conservative in what you send" part of the principle).
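
To illustrate the distinction, here is a minimal Python sketch that
sorts a raw label into a few kinds: plain LDH, LDH carrying the "xn--"
prefix (a candidate A-label), other ASCII, or 8-bit data that only the
application can interpret.  The function name and the category strings
are made up for illustration; nothing in the DNS or IDNA documents
defines them, and the prefix check says nothing about whether the
label is actually a valid A-label.

```python
import re

# LDH label: 1 to 63 ASCII letters, digits, or hyphens, with no hyphen
# at either end (the traditional hostname syntax).
_LDH = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def classify_label(raw: bytes) -> str:
    """Hypothetical helper: classify one DNS label as seen on the wire."""
    try:
        text = raw.decode("ascii")
    except UnicodeDecodeError:
        return "8-bit"              # at least one octet above 0x7F
    if _LDH.fullmatch(text) is None:
        return "other-ascii"        # e.g. underscore, leading hyphen, empty
    if text.lower().startswith("xn--"):
        return "a-label-candidate"  # prefix only; validity is not checked
    return "ldh"


print(classify_label(b"example"))
print(classify_label(b"xn--bcher-kva"))
print(classify_label("bücher".encode("utf-8")))
```

Only the first two kinds have behaviour you can count on across the
Internet; what to do with the rest is a local decision.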

> If I do a DNS query from my Intranet, how am I supposed to know if
> I'm querying UTF-8 or Punycode?

I guess you need to know what software you're running.  And no, of
course, end users don't need to know that.  That's why system vendors
need to be careful about shipping things that don't really work well
given the protocols and operational practices already in place.

> The only thing close to guidance I've heard is that DNS absolutely
> should NOT respond to UTF-8 requests that match a punycode record.
> Which would seem to me to be about the only possible workaround for
> the disparity (try to treat them the same).

Where do you read that?  I think the advice in the documents is that a
conforming implementation doesn't give DNS answers to U-labels, but to
A-labels.  Nothing there impinges in any way on the ability of an
operator to operate with 8 bit labels (which are not U-labels, as is
made clear in the definitions document).
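
For concreteness, the same name can show up in two different forms.
The sketch below uses Python's built-in "punycode" codec to build the
"xn--" form of a single label and prints it next to the raw UTF-8
octets.  It does none of the IDNA validation or mapping, so treat it
as a picture of the two forms and not as a conforming implementation;
to_a_label_sketch is a name invented for this example.

```python
ACE_PREFIX = "xn--"


def to_a_label_sketch(label: str) -> str:
    """Illustrative only: Punycode-encode one label, skipping all IDNA checks."""
    if label.isascii():
        return label  # already ASCII; nothing to encode
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


u_form = "bücher"
print(u_form.encode("utf-8"))      # 8-bit form: b'b\xc3\xbccher'
print(to_a_label_sketch(u_form))   # ASCII form: xn--bcher-kva
```

A lookup for one form tells a server nothing about the other unless
somebody has decided locally to treat them as equivalent.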

> Sorry for the ranting.  I feel like when I mention that IDNA might
> not be a "perfect" approach people respond like I'm delusional.
> Maybe if I saw only the pure Internet root level DNS old-ASCII side,
> it would be an elegant solution, but from where I sit punycode is a
> pretty ill-fitting hack and is causing lots of pain for lots of
> administrators, developers, and users.

Sure.  IDNA is a way of hacking non-ASCII characters into a
nearly-universally-deployed protocol that was implemented in many
cases by people who didn't read the specifications, didn't read or
couldn't find all of them, or who decided to do something that was
maybe strictly legal but unlikely to work well with existing
systems.  The alternative appears to be DNSng and a forklift
upgrade of the Internet.  To the extent we're trying to serve users,
that approach seems less fruitful.

A


-- 
Andrew Sullivan
ajs at shinkuro.com
Shinkuro, Inc.

