IDNA 2008 Question Re: "Confusable" Characters in Domain Names

Nicolas Williams Nicolas.Williams at oracle.com
Mon Nov 8 17:29:52 CET 2010


On Sat, Nov 06, 2010 at 05:35:21AM -0400, Andrew Sullivan wrote:
> On Fri, Nov 05, 2010 at 10:58:30PM +0000, Shawn Steele wrote:
> > But the "zones" are DNS, and they define the rules for their zones.
> > There's no way any browser can tell what rules a particular zone is
> > using.
> 
> I think you're not making a distinction you may need.
> 
> Anyone looking up anything in the DNS can tell what rules everyone
> else is using: they're using RFC 1034 and RFC 1035 and a larger or
> smaller set of subsequent RFCs that refine the way the DNS is used.
> Nothing in IDNA, or for that matter the policies encoded in RFC 1123
> or in ICANN's various agreements or in the CIRA .ca (or pick your
> favourite ccTLD) or the .name registration agreement or the rules
> about blogspot.com names or whatever else you like constrains any of
> that.  Moreover, if you want to set up shawnsteele.example.com and put
> ISO-8859-1 labels in the next level down, _that's also_ perfectly
> legitimate in the DNS, and will "work" in the sense that someone else
> who knows what kind of bitstring you have in that 8859-1 label will be
> able to interpret it.  

+1
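
To make the "bitstring" point concrete, here is a minimal Python sketch.
It assumes only the RFC 1035 wire format for a label (one length octet
followed by up to 63 octets); the helper name wire_label and the sample
text are hypothetical, chosen for illustration.  It shows that the same
visible text produces three different labels depending on the encoding
the zone operator picked, and that nothing on the wire says which one
was used.

```python
# Illustrative sketch; wire_label is a hypothetical helper, not a library API.

def wire_label(raw: bytes) -> bytes:
    """Encode one DNS label as on the wire: a length octet, then the octets."""
    if not 1 <= len(raw) <= 63:
        raise ValueError("a DNS label must be 1..63 octets long")
    return bytes([len(raw)]) + raw

text = "caf\u00e9"  # "cafe" with an acute accent on the final letter

# The same text as an ISO-8859-1 label, a raw UTF-8 label, and an A-label.
latin1_label = wire_label(text.encode("latin-1"))
utf8_label = wire_label(text.encode("utf-8"))
a_label = wire_label(b"xn--" + text.encode("punycode"))

for name, label in [("latin-1", latin1_label),
                    ("utf-8", utf8_label),
                    ("a-label", a_label)]:
    print(f"{name:8} {label!r}")
```

All three are legal octet strings as far as the DNS protocol is
concerned; only the last one is what an IDNA-aware client would send.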

> The distinction you need is that there is no way, in the DNS or,
> currently as far as I know, outside of it, to look up the policies for
> what code points would be acceptable U-label pieces in a U-label in
> that zone.  It might be the case that having such a mechanism would be
> good.  But we don't have it right now.  If we think making such a
> mechanism (or even just defining conventions for how to publish that
> policy) is important, then people should speak up.  The last couple
> times I proposed it, the reaction seemed to me that it wasn't
> important.  So I never bothered to write it up.

I think such a mechanism would be an unfortunate complication for DNS
clients -- not only would it add round trips to lookups, it'd also mean
that clients would have to implement more codeset conversions/encodings.
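
A rough Python sketch of what that burden could look like.  Everything
here is an assumption made for illustration: the ZONE_CHARSETS table
stands in for a per-zone policy lookup that does not exist today, and
the zone names are made up.  The point is only that the same octets
decode to different text depending on which charset the zone declares,
so a client would need both the extra lookup and a decoder for every
charset any zone might choose.

```python
# Hypothetical per-zone charset policy; no such mechanism exists in the DNS.
ZONE_CHARSETS = {
    "latin.example": "iso8859_1",
    "cyrillic.example": "iso8859_5",
}

def decode_label(raw: bytes, zone: str) -> str:
    """Interpret a raw label using the charset its zone (hypothetically) publishes."""
    charset = ZONE_CHARSETS[zone]  # in a real client: an extra round trip
    return raw.decode(charset)     # plus one decoder per charset in use

raw = b"caf\xe9"  # identical octets in both zones

as_latin = decode_label(raw, "latin.example")
as_cyrillic = decode_label(raw, "cyrillic.example")
print(as_latin, as_cyrillic, as_latin == as_cyrillic)
```

With a single agreed encoding the table and the per-charset decoders
both go away.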

I'd rather we avoid it as much as possible.  ISTM that the best way to
avoid it is to get IDNA2008 deployed.

Nico