What rules have been used for the current list of codepoints?

Patrik Fältström patrik at frobbit.se
Wed Dec 13 17:12:02 CET 2006


I understand there is confusion about which rules have been used TODAY
for the list of codepoints.

These are the rules; the first one that matches determines whether the
codepoint is ok to include or not.

1. If block is "IPA Extensions", the codepoint is not ok
2. If the script is "Inherited", the codepoint is not ok
3. If the codepoint is [A-Z], the codepoint is ok
4. If the codepoint is [0-9], the codepoint is ok
5. If NFKC(cp) != cp, the codepoint is not ok
6. If lowercase(cp) != cp, the codepoint is not ok
7. If class is [Ll, Lo, Mn, Mc], the codepoint is ok
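
For readers who want to try the rules out, here is a minimal sketch of the
first-match evaluation in Python. It is not the code used to generate the
table. The names codepoint_ok, in_ipa_extensions, script_of and
no_script_data are made up for illustration. Python's unicodedata module has
no Script property lookup, so rule 2 relies on a caller-supplied script_of
function, and the default placeholder never reports "Inherited". The sketch
also assumes that a codepoint matching none of the rules is not ok, and that
str.lower() is an acceptable stand-in for lowercase(cp).

```python
import unicodedata


def in_ipa_extensions(ch):
    # The "IPA Extensions" block spans U+0250..U+02AF.
    return 0x0250 <= ord(ch) <= 0x02AF


def no_script_data(ch):
    # Placeholder (assumption): the standard library cannot look up the
    # Script property, so this default never answers "Inherited".
    return "Unknown"


def codepoint_ok(ch, script_of=no_script_data):
    """Apply rules 1-7 in order; the first rule that matches decides."""
    if in_ipa_extensions(ch):                       # rule 1
        return False
    if script_of(ch) == "Inherited":                # rule 2
        return False
    if "A" <= ch <= "Z":                            # rule 3
        return True
    if "0" <= ch <= "9":                            # rule 4
        return True
    if unicodedata.normalize("NFKC", ch) != ch:     # rule 5
        return False
    if ch.lower() != ch:                            # rule 6
        return False
    if unicodedata.category(ch) in ("Ll", "Lo", "Mn", "Mc"):  # rule 7
        return True
    # Assumption: the rules do not say what happens when nothing matches;
    # this sketch treats such a codepoint as not ok.
    return False
```

The order matters: "A" is accepted by rule 3 before rule 6 can reject it for
not being lowercase, while a non-ASCII uppercase letter such as U+00C9 falls
through to rule 6 and is rejected.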

I have a suggestion that rule 7 should also include classes Lm and  
Nd, but I have not included that.

Do I see a consensus on this list that I should also include Lm and  
Nd? (Then rule 4 can be removed.)

I also have a suggestion that rule 2 above should be removed, on the
grounds that I went one step too far in my conclusions from earlier
discussions.

Do I see a consensus on this list that I should remove rule 2?

BTW, the URL to the latest document is
http://stupid.domain.name/idnabis/table-latest.html.

Other changes you will see are:

(a) The list of rules (that you see above) will be included in the  
document
(b) The scripts will be in English alphabetical order

     Patrik
  

