looking up domain names with unassigned code points

Frank Ellermann hmdmhdfmhdjmzdtjmzdtzktdkztdjz at gmail.com
Thu May 15 23:17:38 CEST 2008


Shawn Steele wrote:

> I surf for one of the web-based ones when I need to convert.

+1.  Used to a Unicode-ignorant platform, I even needed some
time to figure out how to produce an unassigned code point in
UTF-8 for a funny "I <unassigned black telephone dingbat>
Unicode" title of a blog entry in reply to Mark's 5.1 article.

Some WordPad Alt-X magic finally did the trick; using an NCR
(numeric character reference) instead produced a feed validator
warning, and warnings are bad news.

> Consider also the IMA/EAI UTF-8 effort.  Clients that get a
> pretty UTF-8 name aren't going to be able to process that
> name if they can't do the punycode conversion to do the DNS
> query because they're on the wrong version of IDN.  Sure,
> they can (hopefully) use the fallback address, but I suspect
> that'll have a human readable name, not a punycode string.

AFAIK that affects only the RHS (domain part).  For the LHS
(local part), knowing what the UTF-8 means is not essential:
the MSA can be (again) a "smart host" wrt EAI, and clients
aren't forced to support punycode if they can handle UTF-8.
The same goes for, say, whois servers vs. whois clients.
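
A minimal sketch of that RHS/LHS split, only to illustrate the
point: the function name to_wire_address is hypothetical, and
Python's built-in "idna" codec (IDNA2003, RFC 3490) stands in
for whatever conversion a "smart host" would really apply.  The
domain part is converted to punycode; the local part is passed
through untouched.

```python
def to_wire_address(address: str) -> str:
    """Convert only the domain part (RHS) to ASCII; keep the local part (LHS) as is."""
    # Split at the last "@": everything before it is the local part.
    local, sep, domain = address.rpartition("@")
    if not sep:
        raise ValueError("address has no '@'")
    # The built-in "idna" codec implements IDNA2003 (an assumption about
    # which IDNA version applies; other versions may map names differently).
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"


if __name__ == "__main__":
    print(to_wire_address("jörg@bücher.example"))
```

A client that only handles UTF-8 could hand the address to such
a component and never touch punycode itself.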

 Frank


