Stop me if I've misunderstood...

Shawn Steele Shawn.Steele at microsoft.com
Fri Jul 10 01:53:56 CEST 2009


> So it's
> not quite right that microsoft.com and MICROSOFT.COM "don't compare
> equal in binary form".

They differ in the 0x20 bits.  Therefore a bit-by-bit (i.e., binary) comparison finds them different.
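
To make that concrete, here is a rough Python sketch.  The variable names are made up for illustration; nothing here is taken from an actual resolver or server.

```python
# Two spellings of the same domain name, as raw ASCII octets.
lower = b"microsoft.com"
upper = b"MICROSOFT.COM"

# A plain byte-for-byte comparison treats them as different strings.
binary_equal = (lower == upper)

# XOR each pair of octets to see exactly which bits differ.
# Letters differ only in the 0x20 bit; the "." is identical (XOR of 0).
diff_bits = {x ^ y for x, y in zip(lower, upper)}

print(binary_equal)                         # False
print(sorted(hex(d) for d in diff_bits))    # ['0x0', '0x20']
```

So the two names are equal only after some rule says the 0x20 difference doesn't count for letters.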

> The protocol is quite specific that they _do_ in fact compare equal,
> even though they happen to have different 0x20 bits on the octets.

That's not quite true.  0x30-0x39 are permitted, 0x10-0x19 are not, even though each pair differs only in the 0x20 bit.  ASCII ignore-case is "mapping"; it's just a very simple and easy-to-implement form.  (And doesn't DNS recommend lowercase now anyway?)  You can probably cheat by throwing out anything < 0x30, which takes care of 0x10-0x19 anyway, but it's still mapping.  If you ignore the 0x20 bit by ORing with 0x20, then you're mapping one way (lower case).  If it's by ANDing with 0xdf (or whatever), then it's the other way (upper case).  You're still mapping, it just doesn't look like it :)
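
A small Python sketch of the point, under the assumption that names are handled as raw ASCII octets.  The function names fold_or, fold_and and fold_letters_only are hypothetical, invented just for this illustration:

```python
def fold_or(octets: bytes) -> bytes:
    """Blindly set the 0x20 bit on every octet (maps toward lower case)."""
    return bytes(o | 0x20 for o in octets)


def fold_and(octets: bytes) -> bytes:
    """Blindly clear the 0x20 bit on every octet (maps toward upper case)."""
    return bytes(o & 0xDF for o in octets)


def fold_letters_only(octets: bytes) -> bytes:
    """Lower-case only A-Z (0x41-0x5A); leave every other octet untouched."""
    return bytes((o | 0x20) if 0x41 <= o <= 0x5A else o for o in octets)


# Either blind fold makes the two spellings compare equal...
print(fold_or(b"MICROSOFT.COM") == fold_or(b"microsoft.com"))    # True
print(fold_and(b"MICROSOFT.COM") == fold_and(b"microsoft.com"))  # True

# ...but it also makes the control octet 0x10 compare equal to the digit "0" (0x30).
print(fold_or(b"\x10") == fold_or(b"0"))                         # True

# Restricting the fold to letters keeps 0x10 and 0x30 distinct.
print(fold_letters_only(b"\x10") == fold_letters_only(b"0"))     # False
```

Whichever variant is used, the comparison works only because both sides are first pushed through the same function, which is a mapping by another name.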

> As a matter of actual implementation, however, it's worth noting that
> in just about every real implementation in the wild, the server uses
> the form of the name _as supplied by the user_.

Yes, the mapping happens at a different layer.  That doesn't make comparison impossible; we just need consistent rules.

> This is because most name servers use the compression trick

Sorry, I don't know what the compression trick is :)

-Shawn



More information about the Idna-update mailing list