looking up domain names with unassigned code points

Shawn Steele Shawn.Steele at microsoft.com
Sat May 10 02:28:10 CEST 2008

> Yup, and thanks to IE7's refusing to lookup punycode labels that
> encode ZWJ, it is now difficult to introduce ZWJ in IDNA2008 without
> adding a new prefix (in addition to the existing xn--). Maybe you
> missed that discussion too. Anyway, I'm not suggesting that we add a
> new prefix to the mix -- I'm just saying that IE7's implementation is
> now making things a bit more difficult than they would otherwise be.

That doesn't seem like the only thing that's broken in that scenario. The user couldn't read the resulting URL, etc., and we've been trying to tell users to be wary of URLs that look like gibberish. Also, I'm a bit confused, because I thought the idea of the query was to look up *new* Unicode characters, not ones that had already been decided to be illegal in a name.

My thinking was more along these lines: if we decided not to include some existing Unicode characters, then the casing rules, etc. might still be possible for those. Characters newly added to Unicode would be problematic.

The biggest problem is probably "just" that IDNA2003 was a v1, so there are now a few kinks to work out, and it's so far behind Unicode 5.1. If it could be brought to a somewhat current version of Unicode, then querying for additional characters wouldn't be as interesting (because the OS still needs font support and all that as well).

- Shawn
