Change of the algorithm
Paul Hoffman
phoffman at imc.org
Sat Mar 15 23:12:59 CET 2008
At 6:04 PM -0400 3/15/08, Patrik Fältström wrote:
>The rationale is that a codepoint that is in any of the
>categories B, C and D should be DISALLOWED -- if there is no
>exception or if it is US-ASCII. Regardless of whether it is part of
>any of the other categories.
OK, but...
>Maybe I am thinking wrong here, but there are things that are in C
>and I (i.e. both). My take is that those codepoints should be
>DISALLOWED. Not CONTEXTO.
The definition of C is:
   C: property(cp) is in {Default_Ignorable_Code_Point, White_Space,
      Noncharacter_Code_Point}
. . .
The definition for Default_Ignorable_Code_Point can be found in
DerivedCoreProperties.txt [1] (and erratum of 2007-January-25 [2])
and is:
   Other_Default_Ignorable_Code_Point + Cf + Cc + Cs
   + Noncharacter_Code_Point + Variation_Selector
   - White_Space - FFF9..FFFB (Annotation Characters)
That means that C contains all of {Cf} other than white space and
the annotation characters.
The definition of I is:
   I: generalCategory(cp) is in {Cf}
So, putting the check for a character in C before the check for a
character in I means that the check for the character in I will never
happen if the character is in {Cf}. So, there is no need to define I,
and nothing will ever be CONTEXTO. I don't think that is what you
wanted.
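
To make the ordering problem concrete, here is a rough Python sketch.
It is not the algorithm from the draft: the names in_c_simplified, in_i
and classify are made up for illustration, only the Cf part of the
Default_Ignorable_Code_Point formula quoted above is modelled, and
str.isspace() is used as an approximate stand-in for White_Space. The
category data comes from whatever Unicode version the local Python
ships, not from the 2008 tables.

```python
import unicodedata

# FFF9..FFFB, the annotation characters subtracted in the quoted formula.
ANNOTATION = range(0xFFF9, 0xFFFC)


def in_i(cp):
    """Category I as quoted: generalCategory(cp) is in {Cf}."""
    return unicodedata.category(chr(cp)) == "Cf"


def in_c_simplified(cp):
    """Assumption: only the Cf contribution to category C is modelled.

    str.isspace() approximates White_Space; it is not the real property.
    """
    ch = chr(cp)
    return (
        unicodedata.category(ch) == "Cf"
        and not ch.isspace()
        and cp not in ANNOTATION
    )


def classify(cp, c_first):
    """Apply the two checks in the given order; the first match wins."""
    rules = [("DISALLOWED", in_c_simplified), ("CONTEXTO", in_i)]
    if not c_first:
        rules.reverse()
    for label, predicate in rules:
        if predicate(cp):
            return label
    return "UNDECIDED"  # neither check matched in this toy model


if __name__ == "__main__":
    # U+200D ZERO WIDTH JOINER and U+00AD SOFT HYPHEN are both Cf.
    for cp in (0x200D, 0x00AD):
        print(f"U+{cp:04X}", classify(cp, c_first=True),
              classify(cp, c_first=False))
```

With the C check first, a Cf code point such as U+200D is caught by C
and never reaches the I check; with the I check first, the same code
point comes out as CONTEXTO.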