IDNNever.txt
Erik van der Poel
erikv at google.com
Sat Feb 3 18:29:20 CET 2007
On 2/2/07, Kenneth Whistler <kenw at sybase.com> wrote:
> For conservative criteria for what to absolutely, positively
> guarantee are in the never, never, ever category, I have
> started with:
>
> 1. cp != NFKC(cp)
> 2. cp has Pattern_Syntax property
> 3. cp has Pattern_White_Space property
> 4. cp has White_Space property
> 5. cp has Variation_Selector property
> 6. cp has Noncharacter_Code_Point property
> 7. cp has General_Category=Cf (Unicode format controls)
> 8. cp has General_Category=Cc (ISO controls)
I see that the slash look-alikes (U+2044 and U+2215) are included in
IDNNever thanks to the Pattern_Syntax property:
http://www.unicode.org/Public/UNIDATA/PropList.txt
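
A rough way to see why these two need Pattern_Syntax is to run them through the criteria that Python's standard unicodedata module can test. This is only a sketch: the helper name stdlib_criteria is made up for illustration, it covers criteria 1, 7 and 8 only (unicodedata does not expose Pattern_Syntax, White_Space, and so on), and the results depend on the Unicode version bundled with the interpreter. U+FF0F (FULLWIDTH SOLIDUS) is not mentioned above; it is added purely as a contrast case that criterion 1 does catch.

```python
import unicodedata

def stdlib_criteria(ch):
    """Return which of the quoted criteria 1, 7 and 8 flag the character ch.

    Hypothetical helper: the property-based criteria (2-6) are not checked
    here because unicodedata has no API for those properties.
    """
    hits = []
    if unicodedata.normalize("NFKC", ch) != ch:
        hits.append("cp != NFKC(cp)")
    category = unicodedata.category(ch)
    if category == "Cf":
        hits.append("General_Category=Cf")
    if category == "Cc":
        hits.append("General_Category=Cc")
    return hits

# The two slash look-alikes, plus U+FF0F for contrast.
for ch in ("\u2044", "\u2215", "\uFF0F"):
    print("U+%04X %s: %s" % (ord(ch), unicodedata.name(ch), stdlib_criteria(ch)))
```

Neither U+2044 nor U+2215 has a compatibility decomposition, and both have General_Category=Sm, so none of these three criteria fires for them.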
If you are going to include cp != NFKC(cp), you might as well include
cp != case_fold(cp) for cp > U+007F, since HTML user agents perform
both of these operations (NFKC and case folding).
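
The combined test could look roughly like the sketch below. It assumes that case_fold corresponds to Python's str.casefold() (Unicode full case folding), which may not match the folding a given user agent or nameprep table applies; the function name changed_by_mapping is hypothetical.

```python
import unicodedata

def changed_by_mapping(ch):
    """True if ch is altered by NFKC, or (above U+007F) by case folding.

    Assumption: str.casefold() stands in for case_fold(cp).
    """
    if unicodedata.normalize("NFKC", ch) != ch:
        return True
    # ASCII is left out of the case-fold test, per the cp > U+007F condition.
    return ord(ch) > 0x7F and ch.casefold() != ch

for ch in ("A", "\u00C9", "\u00E9", "\u00DF", "\uFB01"):
    print("U+%04X %s" % (ord(ch), changed_by_mapping(ch)))
```

With the cp > U+007F condition, ASCII uppercase letters are not flagged, while characters such as U+00C9 (which folds to U+00E9) and U+00DF (which folds to "ss") are.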
I'm curious to see how John is going to word the IDNA input vs output issue.
Erik