more fixes for Tables

Patrik Fältström patrik at frobbit.se
Sat Feb 16 22:31:42 CET 2008


On 9 feb 2008, at 16.57, Erik van der Poel wrote:

> Regarding tables-04:
>
> F: cp in {00B7, 02B9, 0375, 0483, 05F3, 05F4, 3007, 303B, 30FB}
>
> This is missing 002D and 3005. Maybe you should just remove this line
> and tell readers to look at the table. Then you don't have to remember
> to keep the two in sync.

Now I understand what you mean by "look at the table". You do not
mean the table in Appendix A.

Well, I understand where you are coming from, but I think the rule
should be there as well as the table. They should simply be kept in sync.
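
A sync check between the rule and the table could be as small as the
sketch below. RULE_F is the set quoted above from tables-04. TABLE_F is
an assumption: it is rule F plus the two code points Erik reports as
missing, standing in for whatever the real table contains. The helper
name out_of_sync is made up for this illustration.

```python
# Rule F as written in tables-04 (copied from the quoted line above).
RULE_F = {0x00B7, 0x02B9, 0x0375, 0x0483, 0x05F3, 0x05F4,
          0x3007, 0x303B, 0x30FB}

# Assumption: the exception entries in the table, modelled here as
# rule F plus the two code points reported as missing from the rule.
TABLE_F = RULE_F | {0x002D, 0x3005}


def out_of_sync(rule, table):
    """Return (missing_from_rule, missing_from_table) as hex strings."""
    def fmt(cps):
        return [f"{cp:04X}" for cp in sorted(cps)]
    return fmt(table - rule), fmt(rule - table)


missing_from_rule, missing_from_table = out_of_sync(RULE_F, TABLE_F)
print("missing from rule: ", missing_from_rule)
print("missing from table:", missing_from_table)
```

Running something like this each time the draft is generated would flag
a mismatch in either direction.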

> Also, the draft says:
>
> E: cp is in {002D, 0030..0039, 0061..007A}
>
> However, in the appendix it says:
>
> 0041..005A  ; PVALID

This was a bug in the script that generates the table. Now fixed.

> And the appendix says:
>
> 01D5..01DC  ; DISALLOWED
>
> However, 01D6 has a canonical decomposition (not a compatibility one),
> so toNFKC does not change it, and 01D6 is not changed by toCasefolding
> either.
>
> How would you like to proceed? If you would like to fix your script,
> please send me the new output privately, and I will run another diff.

This was a bug in the "compose" code I wrote. Composition never happened
correctly if the decomposed string had to be composed in more than one
step.

Now fixed.
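
To illustrate the multi-step case, here is a small sketch using
Python's unicodedata module rather than my own script. It relies on
whatever Unicode version the Python build ships, which may differ from
the one used for the draft. The is_stable helper is my assumed reading
of the check (normalize, casefold, normalize again, compare), not text
taken from the draft.

```python
import unicodedata


def hexes(s):
    """Render a string as a list of 4-digit hex code points."""
    return [f"{ord(c):04X}" for c in s]


cp = "\u01d6"  # LATIN SMALL LETTER U WITH DIAERESIS AND MACRON

# One decomposition step only: 01D6 -> 00FC 0304.
one_step = unicodedata.decomposition(cp)

# Full canonical decomposition needs a second step: 00FC -> 0075 0308.
full = unicodedata.normalize("NFD", cp)

# Recomposition therefore has to compose twice to get back to 01D6.
recomposed = unicodedata.normalize("NFC", full)


def is_stable(ch):
    """Assumed check: NFKC(casefold(NFKC(ch))) == ch."""
    def nfkc(s):
        return unicodedata.normalize("NFKC", s)
    return nfkc(nfkc(ch).casefold()) == ch


print(one_step, hexes(full), hexes(recomposed))
print(is_stable("\u01d6"), is_stable("\u01d5"))
```

A composer that stops after the first pairing leaves 00FC 0304 behind,
so the round trip no longer matches the input and the code point looks
unstable when it is not.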

A list of all code points generated with the fixed script, one per
line, can be found at http://stupid.domain.name/idnabis/f.txt

One example of a line is the following:

0040   DISALLOWED  Y#        COMMERCIAL AT

The "Y" before the # can also be an "N". This is an experiment I am
running to compare against IDNA2003. If the Ruby library implementation
of libidn reports that nameprep(cp) == cp, the flag is "Y"; otherwise
(including when the result is "not valid") it is "N". I do not yet know
whether this check actually produces the correct answer as to whether
the code point was valid in a U-label in IDNA2003. I have to do more
tests. Does anyone else have ideas on what one could do?
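
For anyone who wants to repeat the experiment without Ruby, the sketch
below does roughly the same thing with the nameprep function from
Python's standard encodings.idna module. That is a different
implementation from libidn, so results may not match mine in every
case. The line layout is inferred from the single example line above,
and parse_line and idna2003_flag are names invented for this sketch.

```python
from encodings.idna import nameprep


def parse_line(line):
    """Split a line into (code point, status, flag, name).

    Assumes the layout of the one example line: hex code point,
    status, Y/N flag, then '#' followed by the character name.
    """
    left, _, name = line.partition("#")
    cp_hex, status, flag = left.split()
    return int(cp_hex, 16), status, flag, name.strip()


def idna2003_flag(cp):
    """Return "Y" if nameprep leaves the code point unchanged, else "N".

    A rejected character (nameprep raises UnicodeError) also gives "N".
    """
    ch = chr(cp)
    try:
        return "Y" if nameprep(ch) == ch else "N"
    except UnicodeError:
        return "N"


cp, status, flag, name = parse_line(
    "0040   DISALLOWED  Y#        COMMERCIAL AT")
print(f"{cp:04X}", status, flag, idna2003_flag(cp), name)
```

Note that nameprep on its own does not apply the STD3 ASCII rules, which
is one reason a "Y" here may not mean the code point was usable in a
host name under IDNA2003.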

    Patrik
