IDNNever.txt

Erik van der Poel erikv at google.com
Sat Feb 3 18:29:20 CET 2007


On 2/2/07, Kenneth Whistler <kenw at sybase.com> wrote:
> For conservative criteria for what to absolutely, positively
> guarantee are in the never, never, ever category, I have
> started with:
>
> 1. cp != NFKC(cp)
> 2. cp has Pattern_Syntax property
> 3. cp has Pattern_White_Space property
> 4. cp has White_Space property
> 5. cp has Variation_Selector property
> 6. cp has Noncharacter_Code_Point property
> 7. cp has General_Category=Cf (Unicode format controls)
> 8. cp has General_Category=Cc (ISO controls)

I see that the slash look-alikes (U+2044 and U+2215) are included in
IDNNever thanks to the Pattern_Syntax property:

http://www.unicode.org/Public/UNIDATA/PropList.txt

If you are going to include cp != NFKC(cp), you might as well include
cp != case_fold(cp) for cp > U+007F, since HTML user agents perform
both of these operations (NFKC normalization and case folding).
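
To make the two tests concrete, here is a rough Python sketch. It is only
an illustration under a few assumptions: the function names are made up,
it relies on whatever Unicode version the local Python's unicodedata
module ships with, it uses str.casefold() (full case folding) as a
stand-in for case_fold, and it covers only criteria 1, 7 and 8 plus the
case-folding test. The property-based criteria (Pattern_Syntax and so on)
are not available in the standard library and are left out.

```python
import unicodedata


def changed_by_nfkc(cp: int) -> bool:
    """Criterion 1: cp != NFKC(cp)."""
    ch = chr(cp)
    return unicodedata.normalize("NFKC", ch) != ch


def changed_by_case_fold(cp: int) -> bool:
    """Suggested extra criterion: cp != case_fold(cp), only for cp > U+007F."""
    ch = chr(cp)
    return cp > 0x7F and ch.casefold() != ch


def is_control_or_format(cp: int) -> bool:
    """Criteria 7 and 8: General_Category is Cf or Cc."""
    return unicodedata.category(chr(cp)) in ("Cf", "Cc")


def is_never_candidate(cp: int) -> bool:
    """True if any of the checks sketched above flags the code point."""
    return (
        changed_by_nfkc(cp)
        or changed_by_case_fold(cp)
        or is_control_or_format(cp)
    )


if __name__ == "__main__":
    for cp in (0x0041, 0x00C9, 0x00E9, 0x00DF, 0xFF21, 0x200D, 0x2044):
        print(f"U+{cp:04X}", is_never_candidate(cp))
```

Note that U+00C9 is stable under NFKC but not under case folding, so
only the additional test catches it. U+2044 passes all of the checks in
this sketch, which fits with it being covered by Pattern_Syntax instead.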

I'm curious to see how John is going to word the IDNA input vs output issue.

Erik