IDNA2008 test vectors
Mark Davis ☕
mark at macchiato.com
Tue Mar 29 20:54:30 CEST 2011
That looks like a bug; I'll check it out.
The ideographic period (U+3002) is allowed under IDNA2003, but it should be
mapped to ".". Also, in the next revision there will be a field indicating
that the input isn't allowed under IDNA2008, so that people can distinguish
such cases.
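To illustrate the IDNA2003 behaviour, here is a minimal sketch using Python's built-in "idna" codec, which implements IDNA2003 (RFC 3490) and treats U+3002 as a label separator. The helper name to_ascii_idna2003 and the domain "example。com" are made up for illustration; this is not the code used to generate IdnaTest.txt.

```python
import unicodedata

# U+3002, the "ideographic period" discussed in this thread.
IDEOGRAPHIC_FULL_STOP = "\u3002"


def to_ascii_idna2003(domain: str) -> str:
    """Hypothetical helper: run IDNA2003 ToASCII over a whole domain name.

    Python's standard "idna" codec follows IDNA2003 and splits labels on
    U+002E, U+3002, U+FF0E and U+FF61, joining the results with ".".
    """
    return domain.encode("idna").decode("ascii")


if __name__ == "__main__":
    # Official Unicode character name of U+3002.
    print(unicodedata.name(IDEOGRAPHIC_FULL_STOP))
    # The ideographic period between two labels comes out as an ASCII dot.
    print(to_ascii_idna2003("example" + IDEOGRAPHIC_FULL_STOP + "com"))
```

Under IDNA2008 (RFC 5892) the same code point is DISALLOWED, so an IDNA2008 implementation would be expected to reject this input unless a separate mapping step is applied first.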
Mark
*— Il meglio è l’inimico del bene —*
On Tue, Mar 29, 2011 at 08:33, Simon Josefsson <simon at josefsson.org> wrote:
> Hi Mark,
>
> I'm happy to report that libidn2 handles 116 of the positive test
> vectors in http://www.unicode.org/Public/idna/6.0.1/IdnaTest.txt dated
> 29-dec-2010 with SHA-1 2fb11ede408fe7ab3e1c3b071d8c9c3f0de0d1fc.
>
> Testing all negative test vectors (i.e., test vectors that fail) is more
> cumbersome but I'll try to figure something out.
>
> I'm now going through the remaining positive test vectors that failed
> for some reason, and one of them that caught my eye is below.
>
> Line 2387 of IdnaTest.txt reads:
>
> B; 。; ; ;
>
> To me this means that the source input is U+3002, ToUnicode output is
> U+3002, and ToASCII output is U+3002. It seems weird that the ToASCII
> output is a Unicode string and not an ACE string?!
>
> According to RFC 5892 that code point is disallowed:
>
> 3000..3004 ; DISALLOWED # IDEOGRAPHIC SPACE..JAPANESE INDUSTRIAL STAND
>
> Is this a bug in IdnaTest.txt?
>
> Cheers,
> /Simon