Impact of Punycode

Shawn Steele Shawn.Steele at microsoft.com
Fri Mar 26 01:23:05 CET 2010


An AD server serves UTF-8 machine names, such as in an Intranet.  The problem isn't the AD server per se, but what a client is supposed to do when it gets a non-ASCII address.  Query UTF-8?  Query Punycode?  Both?
What's a UTF-8 aware server supposed to do when it gets a punycode address that matches a UTF-8 address that it has registered?
What's the server supposed to do if it gets a Unicode address that matches a registered punycode address?
What's a UTF-8 aware server supposed to do when it gets a punycode address that would match a UTF-8 address if that address had been mapped per IDNA?
What's a UTF-8 aware server supposed to do if someone wants to register a machine name that is equivalent to a different machine named with a punycode name?
Or a mapped version?
If I do a DNS query from my Intranet, how am I supposed to know if I'm querying UTF-8 or Punycode?
If nothing else, the potential for hijacking (both Punycode and Unicode machines on the same network) is extreme.
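
To make the two competing forms concrete, here is a minimal sketch using Python's built-in "idna" codec (which implements the older IDNA 2003 rules, including the nameprep mapping step). The host name "bücher.example" and the helper same_host are made up for illustration; nothing here describes how AD or any particular DNS server actually behaves.

```python
# Hypothetical host name; any label with a non-ASCII character would do.
name = "bücher.example"

# Form 1: raw UTF-8 octets, as a UTF-8 aware server might store and serve them.
utf8_form = name.encode("utf-8")

# Form 2: the ASCII-compatible (Punycode) form. Python's "idna" codec
# applies IDNA 2003 nameprep mapping and then Punycode to each label.
ace_form = name.encode("idna")

def same_host(a: bytes, b: bytes) -> bool:
    """Illustrative check: treat two names as equal if they share an ACE form."""
    def to_ace(raw: bytes) -> bytes:
        return raw.decode("utf-8").encode("idna")
    return to_ace(a) == to_ace(b)

print(utf8_form)                      # the octets differ from ace_form...
print(ace_form)                       # b'xn--bcher-kva.example'
print(same_host(utf8_form, ace_form)) # ...yet both name the same host

# The mapped case: an upper-case spelling folds to the same ACE form.
print(same_host("BÜCHER.example".encode("utf-8"), ace_form))
```

A byte-for-byte comparison sees three unrelated names here, which is why every question above needs an explicit answer: a server only notices the collision if it chooses to normalise both sides first.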

Not DNS so much, but management stuff:  What're the middle layers supposed to do when they see punycode and Unicode forms of the same address?

The only thing close to guidance I've heard is that DNS absolutely should NOT respond to UTF-8 requests that match a punycode record.  Yet responding to them (trying to treat the two forms the same) would seem to me to be about the only possible workaround for the disparity.

So the "break" is that this is terribly confusing to admins, causes odd mixes of machines, and makes it really hard to support IDN and legacy stuff at the same time without breaking anything.

Punycode "works" right now in browsers because the browsers wrote lots of code to deal with it.  Some other client apps handle it also.  The behavior is often inconsistent (some apps try a Unicode, e.g. UTF-8, query and then a Punycode query; other apps do it in the other order).  There are lots of applications that haven't even tried to figure out what to do, and others that don't even realize they're broken.
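
A toy illustration of why the query order matters, assuming a network where both forms of one name have been registered as unrelated records. The dict below only stands in for a DNS server; the name, the addresses (taken from the 192.0.2.0/24 documentation range), and the resolve helper are all hypothetical.

```python
NAME = "bücher.example"  # hypothetical host name

# Toy "zone": both forms of the same name registered as separate records
# that point at different machines.
zone = {
    NAME.encode("utf-8"): "192.0.2.10",  # machine registered with raw UTF-8 octets
    NAME.encode("idna"): "192.0.2.66",   # different machine under the Punycode form
}

def resolve(name, order):
    """Try each encoding in turn; return (encoding, address) for the first hit."""
    for encoding in order:
        key = name.encode(encoding)
        if key in zone:
            return encoding, zone[key]
    return None

# Two clients asking for the same name, differing only in which form they try first.
utf8_first = resolve(NAME, ("utf-8", "idna"))
ace_first = resolve(NAME, ("idna", "utf-8"))

print(utf8_first)
print(ace_first)
```

The two clients end up talking to different machines for what the user sees as one name, which is the hijacking concern in miniature.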

Sorry for the ranting.  I feel like when I mention that IDNA might not be a "perfect" approach people respond like I'm delusional.  Maybe if I saw only the pure Internet root level DNS old-ASCII side, it would be an elegant solution, but from where I sit punycode is a pretty ill-fitting hack and is causing lots of pain for lots of administrators, developers, and users.

Even so, I'm trying to figure out how to ease the pain.  You haven't heard me advocate abandoning IDNA; I just don't want the same mistakes extended to other areas/protocols.  EAI seems to be a reasonable effort in that regard.  (& yes, I've thought about punycode as a TEMPORARY very-worst-last-case-extreme-use-at-your-own-risk fallback at the very extreme ends of the system (client/mailbox server) for EAI, but at least it goes away as adoption of UTF8SMTP increases.)

-Shawn

-----Original Message-----
From: idna-update-bounces at alvestrand.no [mailto:idna-update-bounces at alvestrand.no] On Behalf Of Paul Hoffman
Sent: Thursday, March 25, 2010 4:49 PM
To: Shawn Steele
Cc: idna-update at alvestrand.no
Subject: RE: Impact of Punycode

At 10:35 PM +0000 3/25/10, Shawn Steele wrote:
>IDN breaks AD DNS servers, which were using octets as permitted.

Say what!?!? When we did the first IDNA interop many years ago, we had at least AD servers in the mix. No one reported any problem.

>  Perhaps some folks don't consider those "real" DNS servers, however they are a scenario some folks now have to reconcile.

All DNS servers count. Please send details about what "breaks" means above, and examples of where it happened.