UTF-8

Andrew Sullivan ajs at shinkuro.com
Thu Jun 17 22:14:37 CEST 2010


On Thu, Jun 17, 2010 at 08:02:18PM +0000, Shawn Steele wrote:
> 
> And, FWIW, if I were building a name server, I'd let it accept UTF-8 requests (They'd have to be U-labels, so the server'd have to use the UTS#46 mappings like any client would, however it wouldn't matter as long as the rules were consistent).

If you were building a nameserver that way, you'd be doing it wrong.
DNS is _already_ 8-bit clean, and always was.  It's right there in the
definition in RFC 1034 and 1035.  _Any_ octet is allowed in DNS
labels.
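
To make that concrete, here is a minimal sketch in Python of how a name is laid out on the wire under RFC 1035: each label is a length octet followed by that many octets, and the name ends with the zero-length root label. The helper name encode_name is made up for illustration and is not taken from any DNS library. It takes labels as raw bytes rather than parsing dotted text.

```python
def encode_name(labels):
    """Encode a list of labels (bytes) into RFC 1035 wire format.

    Hypothetical helper for illustration only.
    """
    out = bytearray()
    for label in labels:
        # The only per-label constraint at the protocol level is length.
        if not 1 <= len(label) <= 63:
            raise ValueError("label must be 1..63 octets")
        out.append(len(label))   # length octet
        out += label             # label octets, copied untouched
    out.append(0)                # zero-length root label ends the name
    if len(out) > 255:
        raise ValueError("name exceeds 255 octets")
    return bytes(out)


# A label holding the UTF-8 encoding of "bücher" (0xC3 0xBC for "ü").
wire = encode_name(["bücher".encode("utf-8"), b"example"])
```

Nothing in the encoding step inspects the octet values. Only the lengths are checked, so octets above 0x7F pass through unchanged.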

The problem is that arbitrary octets aren't allowed in registrable
domain names, which are subject to hostname restrictions defined
outside the DNS.  These are really policy matters, not protocol
matters, but owing to a long history the distinction was not always
understood by implementers, and so a lot of rules that were in fact
policy ended up enshrined in "protocol", broadly (mis)understood.
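
The difference can be sketched as two separate checks: one for what the protocol will carry, and one for the letter-digit-hyphen hostname rule (RFC 952 as relaxed by RFC 1123). Both function names are invented for illustration, and the hostname check is a simplification that looks at a single label only.

```python
import re

# Hostname rule: letters, digits, hyphen; no leading or trailing hyphen;
# 1..63 octets. This is the policy layer, not the DNS protocol.
_LDH = re.compile(rb"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_dns_label(label: bytes) -> bool:
    """Protocol-level check: any octets, as long as the length fits."""
    return 1 <= len(label) <= 63


def is_hostname_label(label: bytes) -> bool:
    """Policy-level check: the letter-digit-hyphen hostname syntax."""
    return _LDH.fullmatch(label) is not None


utf8_label = "bücher".encode("utf-8")
print(is_valid_dns_label(utf8_label), is_hostname_label(utf8_label))
```

A UTF-8 label passes the first check and fails the second, while an ASCII label such as xn--bcher-kva passes both.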

Indeed, the reason so-called UTF-8 "native" labels and other such
stuff all sort of works in a lot of places is exactly _because_ the
DNS was designed with the possibility in mind that the world would
leave 7-bit ASCII restrictions behind.  

A

-- 
Andrew Sullivan
ajs at shinkuro.com
Shinkuro, Inc.

