Allowed characters (was: Re: Casefolding Sigma (was: Re: IDNAbis Preprocessing Draft))

Mark Davis mark.davis at icu-project.org
Wed Mar 26 20:09:08 CET 2008


The tables document isn't yet done, but to get an approximate picture, do
the following.

To compare IDNA2003 input with IDNA2008:

Go to http://unicode.org/cldr/utility/unicodeset.jsp

In Input A, put
[:^idna=disallowed:]

In Input B, put
[[:L:][:Mn:][:Mc:][:Nd:]
-[:^isCaseFolded:]
-[:NFKC_QC=N:]
-[:di:]
-[[:block=Combining_Diacritical_Marks_for_Symbols:]
  [:block=Musical_Symbols:]
  [:block=Ancient_Greek_Musical_Notation:]
 ]
]

Hit Compare.

If you want a detailed listing of the differences, hit Only in A or
Only in B, respectively.
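
For a rough offline check of single characters against the Input B
expression, something like the Python sketch below may help. It is only an
approximation under stated assumptions, not what the unicodeset.jsp tool
does internally: [:^isCaseFolded:] is approximated with str.casefold(),
[:NFKC_QC=N:] with "changes under NFKC", and the [:di:] ranges are
transcribed by hand (the authoritative list is in DerivedCoreProperties.txt
and varies by Unicode version). The function names are made up for
illustration, and results depend on the Unicode version of your Python.

```python
import unicodedata

# Blocks subtracted in Input B (block ranges from the Unicode block list).
EXCLUDED_BLOCKS = [
    (0x20D0, 0x20FF),    # Combining Diacritical Marks for Symbols
    (0x1D100, 0x1D1FF),  # Musical Symbols
    (0x1D200, 0x1D24F),  # Ancient Greek Musical Notation
]

# ASSUMPTION: hand-transcribed Default_Ignorable_Code_Point ranges.
# Check DerivedCoreProperties.txt for the version you care about.
DEFAULT_IGNORABLE = [
    (0x00AD, 0x00AD), (0x034F, 0x034F), (0x061C, 0x061C),
    (0x115F, 0x1160), (0x17B4, 0x17B5), (0x180B, 0x180F),
    (0x200B, 0x200F), (0x202A, 0x202E), (0x2060, 0x206F),
    (0x3164, 0x3164), (0xFE00, 0xFE0F), (0xFEFF, 0xFEFF),
    (0xFFA0, 0xFFA0), (0xFFF0, 0xFFF8), (0x1BCA0, 0x1BCA3),
    (0x1D173, 0x1D17A), (0xE0000, 0xE0FFF),
]


def in_ranges(cp, ranges):
    """Return True if code point cp falls in any inclusive (lo, hi) range."""
    return any(lo <= cp <= hi for lo, hi in ranges)


def approx_input_b(ch):
    """Approximate membership of one character in the Input B set."""
    cp = ord(ch)
    cat = unicodedata.category(ch)
    # [[:L:][:Mn:][:Mc:][:Nd:] ...
    if not (cat.startswith("L") or cat in ("Mn", "Mc", "Nd")):
        return False
    # -[:^isCaseFolded:]  (approximation: unchanged by full case folding)
    if ch.casefold() != ch:
        return False
    # -[:NFKC_QC=N:]  (approximation: unchanged by NFKC on its own)
    if unicodedata.normalize("NFKC", ch) != ch:
        return False
    # -[:di:]
    if in_ranges(cp, DEFAULT_IGNORABLE):
        return False
    # -[the three excluded blocks]
    if in_ranges(cp, EXCLUDED_BLOCKS):
        return False
    return True


def arabic_block_members():
    """Characters of the Arabic block (U+0600..U+06FF) that pass the check."""
    return [chr(cp) for cp in range(0x0600, 0x0700) if approx_input_b(chr(cp))]


if __name__ == "__main__":
    for ch in arabic_block_members():
        print(f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}")
```

Run as a script, it lists the Arabic-block letters, marks and digits that
pass the approximate test, one per line with code point and name.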

If you want to compare IDNA2003 output (instead of input), change A to
[:idna=output:]

If you want to see IDNA2008 restricted to Unicode 3.2, add the following
before the last ] in B:
-[:^age=3.2:]
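
As a rough offline approximation of the Unicode 3.2 restriction, Python's
standard library ships a frozen Unicode 3.2.0 database as
unicodedata.ucd_3_2_0. The sketch below treats a character as "in Unicode
3.2" when that database gives it a category other than Cn (unassigned).
This assumes that [:age=3.2:] in the tool means "assigned as of 3.2"; the
helper names are invented for illustration.

```python
import unicodedata

# Frozen Unicode 3.2.0 database shipped with the Python standard library.
UCD_3_2 = unicodedata.ucd_3_2_0


def assigned_in_unicode_3_2(ch):
    """True if ch was an assigned code point in Unicode 3.2.0.

    ASSUMPTION: 'not Cn in the 3.2.0 database' stands in for the
    tool's [:age=3.2:] property.
    """
    return UCD_3_2.category(ch) != "Cn"


def restrict_to_unicode_3_2(chars):
    """Keep only the characters already assigned in Unicode 3.2.0."""
    return [ch for ch in chars if assigned_in_unicode_3_2(ch)]


if __name__ == "__main__":
    # U+0627 ARABIC LETTER ALEF predates 3.2; U+0750 (Arabic Supplement)
    # was added later, so it is filtered out.
    sample = ["\u0627", "\u0750"]
    for ch in restrict_to_unicode_3_2(sample):
        print(f"U+{ord(ch):04X}")
```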

Mark

On Wed, Mar 26, 2008 at 11:30 AM, Michael Everson <everson at evertype.com>
wrote:

> I'm really just trying to find out which characters in the Arabic
> block are "in" and which are "out". Just letters and diacritics.
> --
> Michael Everson * http://www.evertype.com


