prohibiting previously mapped and unmapped characters

Erik van der Poel erikv at
Wed Nov 29 20:21:08 CET 2006

Some members of the design team may have made such assumptions, but I
only have the Internet Draft to look at:

Note the "clicking on a URI" in section 2.2.1 and the "label
rejection" in 2.2.3. Also note the "No" next to FF00..EF on page 18.

Am I misinterpreting the Internet Drafts? Also, I am not only
concerned about what can actually occur on the wire in a DNS packet. I
am also concerned about the HTML that goes on the wire. Am I alone in
this concern?
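
To make the input/output distinction in this thread concrete, here is a
minimal sketch using Python's standard library. It assumes the built-in
"idna" codec, which implements the RFC 3490 ToASCII operation, and uses a
hypothetical host name (example.com is not taken from this thread) whose
third "w" is the full-width U+FF57. It is an illustration only, not a
statement of how any particular browser behaves.

```python
import unicodedata

# Hypothetical input as it might appear in an HTML href:
# the third "w" is U+FF57 FULLWIDTH LATIN SMALL LETTER W.
name = "ww\uff57.example.com"

# NFKC normalization (one step of Nameprep, RFC 3491) folds
# full-width Latin letters to their ASCII counterparts.
normalized = unicodedata.normalize("NFKC", name)

# Python's built-in "idna" codec applies Nameprep and ToASCII (RFC 3490),
# so the full-width letter never reaches the wire form.
wire = name.encode("idna")

# U+3005 (the CJK iteration mark) has no compatibility decomposition,
# so NFKC leaves it unchanged: it is an "unmapped" character.
iteration_mark_unchanged = unicodedata.normalize("NFKC", "\u3005") == "\u3005"

print(normalized)
print(wire)
print(iteration_mark_unchanged)
```

Under these assumptions the full-width letter is accepted as input but
absent from the output, which is why prohibiting it on input would change
behavior for documents that already contain it.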


On 11/29/06, Mark Davis <mark.davis at> wrote:
> I think one of the background assumptions for this effort is to focus on
> identifying the allowed "output" characters, not the "input" characters.
> That is, full width A-Z are already disallowed in the *output* of IDNA, so
> this would have no change from that.
> In retrospect, we really shouldn't have had the transformation embodied in
> IDNA, just what can actually occur "on the wire".
> Mark
> On 11/29/06, Erik van der Poel <erikv at> wrote:
> >
> > Hello everyone,
> >
> > It's great to see so much energy in the idna200x efforts!
> >
> > One of my concerns is that it may be too late to try to prohibit some
> > of the characters that were previously permitted by rfcs 349[0-2],
> > whether mapped or unmapped in the normalization and case-folding
> > processes. One example that comes to mind is the full-width latin
> > range U+FF01..5E and another is the cjk iteration mark U+3005.
> >
> >
> >
> > Some may decide, after a close reading, that the old rfcs do not allow
> > non-punycode domain names in html, but the fact of the matter is that
> > these do occur. Now that even the market-leading web browser (msie)
> > has a version out that supports these (v7), it may become increasingly
> > difficult to convince some implementors to prohibit characters that
> > actually occur in the wild.
> >
> > (3rd w in
> > is full-width)
> >
> > If it would help, I can take a look at Google's copies of web
> > documents to see which characters are actually used there and how many
> > occurrences there are of each. Of course, such a sample would omit
> > domain names used in email, but the web is quite an important part of
> > the Internet too.
> >
> > Erik van der Poel
> > Google
