Unicode properties

Mark Davis mark.davis at icu-project.org
Thu Jan 10 02:19:53 CET 2008


BTW, over the holidays I updated http://unicode.org/cldr/utility/ so that it
now has the idna2003 properties. For example:

[:idna=output:]
<http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B:idna=output:%5D>
The set of all characters allowed in the output of IDNA.

These can be used for quick comparisons against the idnabis tables. For
example, to get the characters that idna2003 allowed in output but that the
various idnabis table steps would exclude, go to
http://unicode.org/cldr/utility/list-unicodeset.jsp and enter the
regex-like expression:

[[:idna=output:]
  -[[:L:][:Nd:][:Mn:][:Mc:]
  -[:^isCaseFolded:]
  -[:NFKC_QuickCheck=NO:]
  -[:Default_Ignorable_Code_Point:]]
 -[-A-Z]]

(Perl syntax works if you like that better.)
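
If you would rather build a link than paste the expression by hand, here is a
minimal Python sketch. It assumes the page accepts the expression in the `a`
query parameter, as in the [:idna=output:] link above, and that whitespace
inside the expression is insignificant and can be dropped. The names
`unicodeset_url`, `BASE_URL` and `EXPRESSION` are my own, not part of the
utility.

```python
from urllib.parse import quote

# Page named in the message; the "a" parameter is taken from the example link.
BASE_URL = "http://unicode.org/cldr/utility/list-unicodeset.jsp"

# The expression from the message, copied verbatim.
EXPRESSION = """[[:idna=output:]
  -[[:L:][:Nd:][:Mn:][:Mc:]
  -[:^isCaseFolded:]
  -[:NFKC_QuickCheck=NO:]
  -[:Default_Ignorable_Code_Point:]]
 -[-A-Z]]"""


def unicodeset_url(expression: str) -> str:
    """Return a list-unicodeset.jsp link for a UnicodeSet expression."""
    # Assumption: whitespace carries no meaning in the pattern, so remove it.
    compact = "".join(expression.split())
    # Keep ':' and '=' literal so the result matches the style of the
    # example link; brackets and '^' are percent-encoded.
    return f"{BASE_URL}?a={quote(compact, safe=':=')}"


if __name__ == "__main__":
    print(unicodeset_url(EXPRESSION))
```

Running it prints a single URL that can be opened in a browser. Changing
EXPRESSION is enough to compare other property combinations.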

Comments are welcome.

-- 
Mark