looking up domain names with unassigned code points

Shawn Steele Shawn.Steele at microsoft.com
Sat May 10 06:54:44 CEST 2008


> Now, as we try to accommodate ZWJ and other characters in IDNA2008, we
> find that we can no longer assume that those LDH characters will
> guarantee that old software will look up the domain name. In a sense,
> IE7 missed one of the main points of the design of IDNA2003.

That's sort of irrelevant at this point :)  IE uses the normalization component, which I expect to be updated fairly soon after a new spec is written.  Unless the new standard goes beyond the Punycode form and the mapping/normalization steps of IDNA2003, I'm hoping that we can just swap out the component.  Of course some users won't get the benefit for some time, but I'm hopeful that a large number of users can take advantage of the new standard within a reasonable period after its release.

> However, this is not the point I'm trying to make. I'm saying that all
> clients should at least look up any label that is already in Punycode,
> so that it will be easier to make future changes to IDNA, such as
> I<heart>NY, which may not be deemed problematic in the future. The
> argument that people cannot type such characters has begun to carry
> less meaning now that many users click on search results.

Given the security fuss with the introduction of IDNA2003, the browsers opted to allow only the permitted names and exclude the "illegal" ones, which seems like a sensible approach given the negative feedback.  It's also completely unclear to me where the standard says that one should assume Punycode is safe and just use it.  On the contrary, I recall that there were words disallowing illegal xn-- constructs that weren't valid Punycode (granted, Punycode is a superset of IDNA, but still.)
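
As a rough illustration of the "xn-- but not valid Punycode" case, here is a minimal Python sketch. It assumes Python's built-in "punycode" codec as a stand-in for a real client's decoder, and the function name decode_ace_label is hypothetical. It only checks that the part after the prefix is well-formed Punycode; it applies none of the IDNA rules, so a label that passes here could still be disallowed by the standard.

```python
ACE_PREFIX = "xn--"  # the IDNA ACE prefix


def decode_ace_label(label):
    """Return the Unicode form of an xn-- label, or None if the label lacks
    the prefix or the remainder is not well-formed Punycode.

    Hypothetical helper: this performs no IDNA validity checks at all.
    """
    if not label.lower().startswith(ACE_PREFIX):
        return None
    try:
        # Python's "punycode" codec raises a UnicodeError subclass on bad input.
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    except UnicodeError:
        return None


print(decode_ace_label("xn--bcher-kva"))  # well-formed Punycode
print(decode_ace_label("xn--!!!"))        # has the prefix, but not Punycode
```

A client that only looks for the prefix would pass the second label along unchanged; one that decodes first can refuse it before lookup.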

- Shawn

