IDNA2008 test vectors

Simon Josefsson simon at josefsson.org
Tue Mar 29 17:33:56 CEST 2011


Hi Mark,

I'm happy to report that libidn2 handles 116 of the positive test
vectors in http://www.unicode.org/Public/idna/6.0.1/IdnaTest.txt dated
29-dec-2010 with SHA-1 2fb11ede408fe7ab3e1c3b071d8c9c3f0de0d1fc.

Testing all negative test vectors (i.e., test vectors that are expected
to fail) is more cumbersome, but I'll try to figure something out.

I'm now going through the remaining positive test vectors that failed
for some reason, and one that caught my eye is below.

Line 2387 of IdnaTest.txt reads:

B;	。;	;	;	

To me this means that the source input is U+3002, ToUnicode output is
U+3002, and ToASCII output is U+3002.  It seems weird that the ToASCII
output is a Unicode string and not an ACE string?!
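
The reading above can be sketched in a few lines of Python. This is only an illustration of my interpretation, assuming that the fields are type, source, ToUnicode and ToASCII in that order, and that an empty field falls back to the value of the field before it; parse_vector is a made-up helper name, not part of libidn2.

```python
def parse_vector(line):
    """Split one IdnaTest.txt line into (type, source, to_unicode, to_ascii).

    Assumption: an empty ToUnicode field means "same as source" and an
    empty ToASCII field means "same as ToUnicode".
    """
    body = line.split("#", 1)[0]          # drop any trailing comment
    fields = [f.strip() for f in body.split(";")]
    fields += [""] * (4 - len(fields))    # pad short lines
    kind, source, to_unicode, to_ascii = fields[:4]
    to_unicode = to_unicode or source
    to_ascii = to_ascii or to_unicode
    return kind, source, to_unicode, to_ascii


# The test vector line quoted above, with U+3002 written as an escape.
line = "B;\t\u3002;\t;\t;\t"
kind, source, to_unicode, to_ascii = parse_vector(line)
print(kind, [f"U+{ord(c):04X}" for c in to_ascii])  # B ['U+3002']
```

Under that interpretation the expected ToASCII output is the single code point U+3002, which is what looks odd to me.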

According to RFC 5892 that code point is disallowed:

3000..3004  ; DISALLOWED  # IDEOGRAPHIC SPACE..JAPANESE INDUSTRIAL STAND
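
A minimal sketch of the lookup, assuming the table lines have the "first..last ; PROPERTY # comment" shape of the one quoted above; parse_range_line and property_of are hypothetical helper names, and only the single quoted line is used as data.

```python
def parse_range_line(line):
    """Parse 'XXXX..YYYY ; PROPERTY # comment' into (lo, hi, property)."""
    body = line.split("#", 1)[0]                      # drop the comment
    cps, prop = (part.strip() for part in body.split(";", 1))
    first, _, last = cps.partition("..")
    lo = int(first, 16)
    hi = int(last, 16) if last else lo                # single code point line
    return lo, hi, prop


def property_of(cp, lines):
    """Return the property of code point cp, or None if no line covers it."""
    for line in lines:
        lo, hi, prop = parse_range_line(line)
        if lo <= cp <= hi:
            return prop
    return None


# Only the line quoted from RFC 5892 above; the real table is much longer.
TABLE = ["3000..3004  ; DISALLOWED  # IDEOGRAPHIC SPACE..JAPANESE INDUSTRIAL STAND"]
print(property_of(0x3002, TABLE))  # DISALLOWED
```
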

Is this a bug in IdnaTest.txt?

Cheers,
/Simon

