Question about idnabis-table-04

Patrik Fältström patrik at frobbit.se
Tue Dec 9 12:29:47 CET 2008


On 2 dec 2008, at 10.51, SUN Guonian wrote:

> from draft-ietf-idnabis-tables-04.txt, I could see these characters
> are divided in two categories, PVALID and DISALLOWED.
>
> FA0E..FA0F  ; PVALID      # CJK COMPATIBILITY IDEOGRAPH-FA0E..CJK COMPAT
> FA10        ; DISALLOWED  # CJK COMPATIBILITY IDEOGRAPH-FA10
> FA11        ; PVALID      # CJK COMPATIBILITY IDEOGRAPH-FA11
> FA12        ; DISALLOWED  # CJK COMPATIBILITY IDEOGRAPH-FA12
> FA13..FA14  ; PVALID      # CJK COMPATIBILITY IDEOGRAPH-FA13..CJK COMPAT
> FA15..FA1E  ; DISALLOWED  # CJK COMPATIBILITY IDEOGRAPH-FA15..CJK COMPAT
> FA1F        ; PVALID      # CJK COMPATIBILITY IDEOGRAPH-FA1F
> FA20        ; DISALLOWED  # CJK COMPATIBILITY IDEOGRAPH-FA20
> FA21        ; PVALID      # CJK COMPATIBILITY IDEOGRAPH-FA21
> FA22        ; DISALLOWED  # CJK COMPATIBILITY IDEOGRAPH-FA22
> FA23..FA24  ; PVALID      # CJK COMPATIBILITY IDEOGRAPH-FA23..CJK COMPAT
> FA25..FA26  ; DISALLOWED  # CJK COMPATIBILITY IDEOGRAPH-FA25..CJK COMPAT
> FA27..FA29  ; PVALID      # CJK COMPATIBILITY IDEOGRAPH-FA27..CJK COMPAT
> FA2A..FA2D  ; DISALLOWED  # CJK COMPATIBILITY IDEOGRAPH-FA2A..CJK COMPAT
> FA30..FA6A  ; DISALLOWED  # CJK COMPATIBILITY IDEOGRAPH-FA30..CJK COMPAT
> FA70..FAD9  ; DISALLOWED  # CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COMPAT
>
> What rule makes these characters U+FA10, U+FA12, ... DISALLOWED ? I
> am not sure if they fallen under DISALLOWED because U+FA10 == U+585A
> and U+FA12 == U+6674, ... respectively.
>
> Maybe I missed some documents about this. Could you give me more
> hints ?

You saw the response from Ken I hope.

To answer from an IDNA perspective, U+FA10 (CJK COMPATIBILITY
IDEOGRAPH-FA10) matches two of the categories defined in the tables
document, LetterDigits and Unstable:

2.1.  LetterDigits (A)

    A: generalCategory(cp) is in {Ll, Lu, Lo, Nd, Lm, Mn, Mc}

2.2.  Unstable (B)

    B: toNFKC(toCaseFold(toNFKC(cp))) != cp
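
As a rough illustration only (this is not from the tables document),
the two definitions above can be approximated in Python. The function
names are mine. I assume that str.casefold() is close enough to
toCaseFold, and the unicodedata module uses whatever Unicode version
ships with the interpreter, so results for some code points may differ
from the tables in the draft.

```python
import unicodedata

# General categories listed in rule A (LetterDigits).
LETTER_DIGIT_CATEGORIES = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def is_letter_digit(cp: int) -> bool:
    """Approximate rule A: generalCategory(cp) is in the set above."""
    return unicodedata.category(chr(cp)) in LETTER_DIGIT_CATEGORIES


def is_unstable(cp: int) -> bool:
    """Approximate rule B: toNFKC(toCaseFold(toNFKC(cp))) != cp.

    Assumption: str.casefold() stands in for toCaseFold.
    """
    ch = chr(cp)
    nfkc = unicodedata.normalize("NFKC", ch)
    return unicodedata.normalize("NFKC", nfkc.casefold()) != ch


for cp in (0xFA10, 0xFA11):
    print(f"U+{cp:04X}", is_letter_digit(cp), is_unstable(cp))
# U+FA10 True True
# U+FA11 True False
```

Both code points are letters (Lo), but only U+FA10 changes under NFKC.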

If you then look at the series of rules to calculate the value, we  
have this:

    1.   If the code point is in Exceptions (Section 2.6), the value is
         according to the table in Section 2.6.
    2.   If the code point is in BackwardCompatible (Section 2.7), the
         value is according to the table in Section 2.7.
    3.   If the code point is in Unassigned (Section 2.10), the value is
         UNASSIGNED.
    4.   If the code point is in LDH (Section 2.5), the value is PVALID.
    5.   If the code point is in JoinControl (Section 2.8), the value is
         CONTEXTJ.
    6.   If the code point is in Unstable (Section 2.2), the value is
         DISALLOWED.
    7.   If the code point is in IgnorableProperties (Section 2.3), the
         value is DISALLOWED.
    8.   If the code point is in IgnorableBlocks (Section 2.4), the
         value is DISALLOWED.
    9.   If the code point is in OldHangulJamo (Section 2.9), the value
         is DISALLOWED.
    10.  If the code point is in LetterDigits (Section 2.1), the value
         is PVALID.
    11.  If the code point is not in LetterDigits (Section 2.1), the
         value is DISALLOWED.
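
The "first match wins" ordering of these rules can be sketched in
Python as below. This is a simplified illustration, not the algorithm
from the draft: only the Unstable and LetterDigits tests quoted in this
message are implemented (with str.casefold() assumed to approximate
toCaseFold). The other categories are hypothetical stubs that never
match, because their definitions are not reproduced here, so the sketch
gives wrong answers for code points that belong to those categories.

```python
import unicodedata

LETTER_DIGIT_CATEGORIES = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def in_unstable(ch):
    # toNFKC(toCaseFold(toNFKC(cp))) != cp, with casefold() as a stand-in
    nfkc = unicodedata.normalize("NFKC", ch)
    return unicodedata.normalize("NFKC", nfkc.casefold()) != ch


def in_letter_digits(ch):
    return unicodedata.category(ch) in LETTER_DIGIT_CATEGORIES


def not_modelled(ch):
    # Placeholder for categories whose definitions are not shown here.
    return False


# (rule number, category, membership test, resulting value), in order.
RULES = [
    (1, "Exceptions", not_modelled, "per table in Section 2.6"),
    (2, "BackwardCompatible", not_modelled, "per table in Section 2.7"),
    (3, "Unassigned", not_modelled, "UNASSIGNED"),
    (4, "LDH", not_modelled, "PVALID"),
    (5, "JoinControl", not_modelled, "CONTEXTJ"),
    (6, "Unstable", in_unstable, "DISALLOWED"),
    (7, "IgnorableProperties", not_modelled, "DISALLOWED"),
    (8, "IgnorableBlocks", not_modelled, "DISALLOWED"),
    (9, "OldHangulJamo", not_modelled, "DISALLOWED"),
    (10, "LetterDigits", in_letter_digits, "PVALID"),
    (11, "not LetterDigits", lambda ch: not in_letter_digits(ch),
     "DISALLOWED"),
]


def derive(cp):
    """Return (rule number, value) for the first rule that matches."""
    ch = chr(cp)
    for number, _name, matches, value in RULES:
        if matches(ch):
            return number, value
    raise AssertionError("rules 10 and 11 together cover every code point")


print(derive(0xFA10))  # (6, 'DISALLOWED')
print(derive(0xFA11))  # (10, 'PVALID')
```

Because the list is walked in order, a code point that is both Unstable
and in LetterDigits never reaches rule 10.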

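Going through these rules one by one, the first match is rule 6
(Unstable), so the value is DISALLOWED. U+FA10 is Unstable because NFKC
maps it to U+585A, as you guessed. Rule 10 (LetterDigits) is never
reached.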

     Patrik


