Unicode properties

Harald Alvestrand harald at alvestrand.no
Sat Jan 12 10:23:08 CET 2008


Mark Davis wrote:
> BTW, over the holidays I updated http://unicode.org/cldr/utility/ so
> that it now has the idna2003 properties. For example:
>
> |[:idna=output:]|
> <http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B:idna=output:%5D>
> The set of all characters allowed in the output of IDNA.
>
> These can be used for quick comparisons of the idnabis tables. For
> example, to get the IDNA characters that were allowed, but would be
> excluded by various idnabis table steps, one can go to
> http://unicode.org/cldr/utility/list-unicodeset.jsp and put in the
> regex-like expression:
>
> [[:idna=output:]
>   -[[:L:][:Nd:][:Mn:][:Mc:]
>   -[:^isCaseFolded:]
>   -[:NFKC_QuickCheck=NO:]
>   -[:Default_Ignorable_Code_Point:]]
>  -[-A-Z]]
>
> (Perl syntax works if you like that better.)
>
> Comments are welcome.
This seems very useful - it lets us see very quickly which
characters we're talking about.

Thanks!
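
For anyone who wants to play with the same kind of set arithmetic
offline, here is a rough Python sketch of the quoted expression. It is
only an approximation under stated assumptions: the Python standard
library has no idna=output or Default_Ignorable_Code_Point property, so
the starting set is passed in by the caller and the default-ignorable
step is left out; isCaseFolded and NFKC_QuickCheck=NO are approximated
per character with str.casefold() and unicodedata.normalize(). The
function names are made up for illustration, and results depend on the
Unicode version bundled with your Python.

```python
import unicodedata

# Hypothetical helper names; this is a sketch, not the CLDR utility's logic.

def survives_table_steps(ch: str) -> bool:
    """Approximate the inner set:
    [[:L:][:Nd:][:Mn:][:Mc:] - [:^isCaseFolded:] - [:NFKC_QuickCheck=NO:]]
    (the Default_Ignorable_Code_Point step is omitted: no stdlib property).
    """
    cat = unicodedata.category(ch)
    in_categories = cat.startswith("L") or cat in ("Nd", "Mn", "Mc")
    # Approximation of isCaseFolded: case folding leaves the character alone.
    is_case_folded = ch.casefold() == ch
    # Approximation of "not NFKC_QuickCheck=NO": NFKC leaves it unchanged.
    nfkc_stable = unicodedata.normalize("NFKC", ch) == ch
    return in_categories and is_case_folded and nfkc_stable


def excluded_characters(start_set):
    """Characters in start_set that the steps above would drop,
    ignoring '-' and A-Z (the trailing -[-A-Z] in the expression)."""
    ignored = {"-"} | {chr(c) for c in range(ord("A"), ord("Z") + 1)}
    return {
        ch for ch in start_set
        if not survives_table_steps(ch) and ch not in ignored
    }


if __name__ == "__main__":
    # Made-up sample standing in for [:idna=output:].
    sample = set("aA-_") | {"\u00df", "\u00bd", "\u00e9"}
    for ch in sorted(excluded_characters(sample)):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch, '?')}")
```

The sample set is invented; to reproduce the real comparison you would
need the actual idna=output set, for example exported from the utility
page mentioned above.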


