Early look at draft-idnabis-issues-00d

Simon Josefsson jas at extundo.com
Mon Nov 6 15:56:22 CET 2006


I'm still digesting this document...

First, just a question: what is "Stable NFKC"?  Is there a reference?
It seems like this will be the essential contribution of IDNA200x.

Second, a suggestion: discuss the move from Unicode 3.2 to Unicode 5.0
more prominently, and also the problems stemming from that.  The
added characters from Unicode 5.0 are a major new feature, so they
should be more visible.  There is one problem in handling the NFKC
breakage that the UTC introduced after Unicode 3.2 -- the PR29 change
-- but those strings can be detected and prevented by IDNA200x.  I
can describe how LibIDN does this separately, if there is interest in
that approach.

While reading the document, I've noticed some areas where it could be
improved:

The IDNA model flow in section 2 should be improved to make it clear
that all-ASCII inputs, and some Unicode input strings, are converted
to ASCII hostnames in the DNS.  In other words, at least with
IDNA2003, not all inputs generate a punycode'd output string.  The
section currently gives the impression that all strings are punycode
encoded; I suspect this is just sloppy use of terminology.
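To illustrate the point, here is a minimal sketch using Python's
built-in "idna" codec, which implements IDNA2003.  The host names and
the helper name to_dns_name are made up for the example; this is not
taken from the draft.

```python
# Sketch: Python's built-in "idna" codec follows IDNA2003.
# The host names below are made-up examples.

def to_dns_name(name: str) -> str:
    """Return the ASCII form of a host name as it would appear in the DNS."""
    return name.encode("idna").decode("ascii")

ascii_only = to_dns_name("www.example.com")
non_ascii = to_dns_name("bücher.example")

print(ascii_only)  # passed through unchanged, no "xn--" label
print(non_ascii)   # only the non-ASCII label gets encoded
```

The all-ASCII name comes out exactly as it went in; only the label
containing a non-ASCII character is converted.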

Specifically, section 2.1.7 should permit that punycode is not used at
all, and section 2.1.8 should not say that the string has to be
punycode encoded.

The same problem is in section 2.2 -- not all IDNs are punycode
encoded.

The way the term "punycode string" is used in section 5.1 indicates a
misunderstanding of what punycode is.  (This may also explain the
above flaw).  Punycode is an encoding of Unicode, comparable to, say,
UTF-7.  Instead of "a Punycode string", I think you mean "ASCII-encoded
IDN" or similar.
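The distinction can be sketched with Python's standard codecs.  The
label is a made-up example, and the variable names are mine; the
"xn--" prefix is the ACE prefix from IDNA2003.

```python
# Sketch: Punycode is only the encoding step; the label that ends up
# in the DNS is the ACE prefix followed by the Punycode output.
ACE_PREFIX = "xn--"  # ACE prefix defined by IDNA2003

label = "bücher"  # made-up example label

punycode = label.encode("punycode").decode("ascii")  # raw Punycode
ace_label = label.encode("idna").decode("ascii")     # ASCII-encoded IDN label
utf7 = label.encode("utf-7").decode("ascii")         # another ASCII encoding of Unicode

print(punycode)
print(ace_label)
print(utf7)
```

For this label, which nameprep leaves unchanged, the ASCII-encoded
form is just the ACE prefix plus the Punycode output, which is why
the two terms should not be used interchangeably.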

I really like section 5, it makes it clear what backwards compatible
changes we can do, and which we cannot do.  It may need further
tweaking, but it is a useful section.

In conclusion, while there are some useful generic discussions and
concerns, it seems this document needs quite some work before it is
close to becoming something that is implementable.  It's difficult to
discuss IDNA2003 vs IDNA200x until the details are fleshed out.


PS.  I posted this several weeks ago, but it didn't arrive in the
archives, so it was probably filtered out.
