looking up domain names with unassigned code points
Erik van der Poel
erikv at google.com
Sat May 10 01:37:03 CEST 2008
> > In my opinion, it's a good thing that MSIE7
> > refuses to look up Unicode labels with unassigned code points, but
> > it's bad that it also refuses to look up Punycode labels that encode
> > unassigned code points.
> What's the difference? I really don't expect links to be Punycode, and
> no one's going to enter them that way, so it seems much more
> interesting to look up recently added (after 3.2) Unicode code points.
Most links with non-ASCII hosts are currently encoded in Punycode. The
main reason for this is that MSIE6 does not support IDNA2003.
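
To make that concrete, here is a minimal Python sketch of what such a
host looks like in a link. It assumes each label has already been
lowercased and otherwise prepared, and it skips Nameprep entirely. The
function names are mine, purely for illustration.

```python
def to_ace_label(label: str) -> str:
    """Encode one prepared label as "xn--" + Punycode.

    ASCII-only labels are returned unchanged. No Nameprep is applied
    here, so this is not a full IDNA2003 ToASCII implementation.
    """
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def to_ace_host(host: str) -> str:
    """Apply to_ace_label to each dot-separated label of a host name."""
    return ".".join(to_ace_label(label) for label in host.split("."))
```

With this, to_ace_host("bücher.example") gives "xn--bcher-kva.example",
which is the form that also works in a browser without IDNA support.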
Also, I'm not talking about code points that have been assigned since
Unicode 3.2 came out. I'm talking about code points that are currently
still unassigned, but may be assigned to uppercase letters in the
future. An IDNA pre-processing implementation would not know how to
lowercase such a code point before Punycode-encoding it, because it
cannot know that the code point will turn out to be uppercase.
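
A rough Python sketch of the problem follows. Since nobody can know
future assignments, it uses a stand-in: Python's Unicode 3.2 tables
(unicodedata.ucd_3_2_0) play the "old" implementation, and U+023B, which
was unassigned in 3.2 and later became an uppercase letter, plays the
"future" code point. The helper names are hypothetical, and plain
str.lower() stands in for the real Nameprep case mapping.

```python
import unicodedata

# Unicode 3.2 tables, the version IDNA2003 is based on.
OLD_UCD = unicodedata.ucd_3_2_0


def is_unassigned_in_old_tables(ch: str) -> bool:
    """True if the old tables have no assignment for this code point."""
    return OLD_UCD.category(ch) == "Cn"


def lowercase_with_old_tables(label: str) -> str:
    """Lowercase what the old tables know; pass unassigned code points through."""
    return "".join(
        ch if is_unassigned_in_old_tables(ch) else ch.lower()
        for ch in label
    )


def punycode(label: str) -> str:
    """Punycode-encode a label (without the xn-- prefix)."""
    return label.encode("punycode").decode("ascii")


label = "\u023bafe"  # U+023B followed by "afe"
old_result = lowercase_with_old_tables(label)  # U+023B left untouched
new_result = label.lower()                     # current tables map it to U+023C

# The two clients end up with different Punycode labels for the same input.
labels_differ = punycode(old_result) != punycode(new_result)
```

The old client and the updated client produce different labels from the
same input, so they would look up different names.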
> > There seem to be at least 2 camps with regard to the unassigned issue:
> > those that want to allow such lookups, so that "old" clients continue
> > to "work" when newly assigned code points are used
> But it takes FOREVER for updates and service packs to trickle down to
> the end user for things like IE.
Yup, and thanks to IE7's refusing to look up Punycode labels that
encode ZWJ, it is now difficult to introduce ZWJ in IDNA2008 without
adding a new prefix (in addition to the existing xn--). Maybe you
missed that discussion too. Anyway, I'm not suggesting that we add a
new prefix to the mix -- I'm just saying that IE7's implementation is
now making things a bit more difficult than they would otherwise be.
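
For reference, finding out whether a label that is already in Punycode
encodes ZWJ takes only a decode and a scan. A rough Python sketch, with
hypothetical helper names, and not a claim about how IE7 actually
implements its check:

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER


def decode_ace_label(label: str) -> str:
    """Decode an xn-- label to Unicode; other labels are returned as-is.

    Malformed Punycode raises UnicodeError, which is not handled here.
    """
    if label[:4].lower() != "xn--":
        return label
    return label[4:].encode("ascii").decode("punycode")


def encodes_zwj(label: str) -> bool:
    """True if the decoded label contains ZWJ."""
    return ZWJ in decode_ace_label(label)
```

A client could use a check like this either to refuse the lookup or to
pass the label through to the DNS untouched.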
> In the meantime the person with the "new"
> domain name can't be guaranteed that it works for anyone.
It would work in IE6 and Firefox if the links used Punycode.
Anyway, it sounds like you're in the "old" client camp, and I can
certainly see how someone would end up thinking that way. Maybe we
just have to agree to disagree here.
I'm quite curious to hear from John regarding lookup of labels that
are already in Punycode. He mentioned earlier that this working group
would probably have to address this issue at some point...