Allowed characters (was: Re: Casefolding Sigma (was: Re:
IDNAbis Preprocessing Draft)
Mark Davis
mark.davis at icu-project.org
Wed Mar 26 20:09:08 CET 2008
The tables document isn't yet done, but to get an approximate picture, do
the following.
To compare IDNA2003 input with IDNA2008:
Go to http://unicode.org/cldr/utility/unicodeset.jsp
In Input A, put
[:^idna=disallowed:]
In Input B, put
[[:L:][:Mn:][:Mc:][:Nd:]
-[:^isCaseFolded:]
-[:NFKC_QC=N:]
-[:di:]
-[[:block=Combining_Diacritical_Marks_for_Symbols:]
[:block=Musical_Symbols:]
[:block=Ancient_Greek_Musical_Notation:]
]
]
Hit Compare.
If you want a detailed listing of the differences, hit Only in A or Only in B,
respectively.
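For readers without access to the web tool, the set in Input B can be roughly approximated offline. The following is a minimal sketch only, not the tool's actual logic: it assumes Python's unicodedata module, uses str.casefold() as a stand-in for isCaseFolded, treats "NFKC leaves the character unchanged" as a stand-in for NFKC_QC not being N, and leaves out the [:di:] subtraction because unicodedata does not expose Default_Ignorable_Code_Point. The names approx_in_set_b and arabic_block_members are hypothetical, and results reflect the Unicode version bundled with Python rather than the 2008 data.

```python
import unicodedata

# Blocks subtracted in Input B, as fixed code point ranges.
EXCLUDED_BLOCKS = (
    (0x20D0, 0x20FF),    # Combining Diacritical Marks for Symbols
    (0x1D100, 0x1D1FF),  # Musical Symbols
    (0x1D200, 0x1D24F),  # Ancient Greek Musical Notation
)


def approx_in_set_b(ch):
    """Approximate membership in the Input B set for one character.

    Assumption: the [:di:] subtraction is NOT modelled here.
    """
    cp = ord(ch)
    cat = unicodedata.category(ch)
    # [[:L:][:Mn:][:Mc:][:Nd:]]
    if not (cat.startswith("L") or cat in ("Mn", "Mc", "Nd")):
        return False
    # -[:^isCaseFolded:]  (approximated with full case folding)
    if ch.casefold() != ch:
        return False
    # -[:NFKC_QC=N:]  (approximated: NFKC must leave the character unchanged)
    if unicodedata.normalize("NFKC", ch) != ch:
        return False
    # -[ the three excluded blocks ]
    if any(lo <= cp <= hi for lo, hi in EXCLUDED_BLOCKS):
        return False
    return True


def arabic_block_members():
    """Characters of the Arabic block (U+0600..U+06FF) that pass the filter."""
    return [chr(cp) for cp in range(0x0600, 0x0700)
            if approx_in_set_b(chr(cp))]
```

To list the Arabic letters and diacritics that pass this approximation, loop over arabic_block_members() and print each code point with unicodedata.name(ch, "?"). Treat the output as a rough guide and check any disagreement against the web tool.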
If you want to compare IDNA2003 output (instead of input), change A to
[:idna=output:]
If you want to see IDNA2008 restricted to Unicode 3.2, add the following
before the last ] in B
-[:^age=3.2:]
Mark
On Wed, Mar 26, 2008 at 11:30 AM, Michael Everson <everson at evertype.com>
wrote:
> I'm really just trying to find out which characters in the Arabic
> block are "in" and which are "out". Just letters and diacritics.
> --
> Michael Everson * http://www.evertype.com
--
Mark