What rules have been used for the current list of codepoints?
Kenneth Whistler
kenw at sybase.com
Wed Dec 13 17:45:01 CET 2006
Patrik,
> I understand there is confusion about which rules have been used TODAY
> for the list of codepoints.
>
> These are the rules; the first one that matches tells whether the
> codepoint is ok to include or not.
>
> 1. If block is "IPA Extensions", the codepoint is not ok
> 2. If the script is "Inherited", the codepoint is not ok
> 3. If the codepoint is [A-Z], the codepoint is ok
> 4. If the codepoint is [0-9], the codepoint is ok
> 5. If NFKC(cp) != cp, the codepoint is not ok
> 6. If lowercase(cp) != cp, the codepoint is not ok
> 7. If class is [Ll, Lo, Mn, Mc], the codepoint is ok
>
> I have a suggestion that rule 7 should also include classes Lm and
> Nd, but I have not included that.
>
> Do I see a consensus on this list that I should also include Lm and
> Nd? (Then rule 4 can be removed.)
I have no idea how consensus on this list is measured, but
*I* am absolutely sure that Lm and Nd need to be added. In
fact, using the formulation you are using here for rules,
the whole list of rules should be reconstructed as:
1. If class is [Ll, Lm, Lo, Mn, Mc, Nd], the code point is ok
2. If NFKC(cp) != cp, the code point is not ok
3. If lowercase(cp) != cp, the code point is not ok
And that is pretty much exactly what I stated in the November 30
contribution.
And the results from that are posted at:
http://www.unicode.org/~whistler/SPLlLoLmMnMcNdStableCaseNFKC.txt
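For illustration only, the three reconstructed rules above could be sketched in Python roughly as follows. This is a sketch under assumptions, not the script that produced the posted file: it reads the rules as a conjunction (a code point is ok only if it is in one of the listed classes and is unchanged by both NFKC and lowercasing), the name is_ok is hypothetical, and the results depend on the version of the Unicode data bundled with the Python interpreter.

```python
import unicodedata

# General categories admitted by reconstructed rule 1.
ALLOWED_CLASSES = {"Ll", "Lm", "Lo", "Mn", "Mc", "Nd"}


def is_ok(cp: int) -> bool:
    """Return True if code point `cp` passes all three reconstructed rules.

    Assumption: the rules are combined as a conjunction, i.e. the class
    test admits candidates and the NFKC and lowercase tests then exclude
    anything that is not stable.
    """
    ch = chr(cp)
    # Rule 1: the class must be one of Ll, Lm, Lo, Mn, Mc, Nd.
    if unicodedata.category(ch) not in ALLOWED_CLASSES:
        return False
    # Rule 2: NFKC(cp) != cp means not ok.
    if unicodedata.normalize("NFKC", ch) != ch:
        return False
    # Rule 3: lowercase(cp) != cp means not ok.
    if ch.lower() != ch:
        return False
    return True


if __name__ == "__main__":
    # A few sample code points, printed with their verdicts.
    for cp in (0x0061, 0x0041, 0x0035, 0x0250, 0x0301, 0xFB01):
        print(f"U+{cp:04X} {unicodedata.name(chr(cp), '<unnamed>')}: {is_ok(cp)}")
```

Under this reading no separate rules are needed for [0-9] (covered by Nd), and characters in the IPA Extensions block are judged by the same three tests as everything else.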
>
> I also have a suggestion that rule 2 above should be removed, that I
> went one step too far in conclusions from earlier discussions.
>
> Do I see a consensus on this list that I should remove rule 2?
Again I do not understand how you judge consensus, but stated
that way, yes, remove rule 2.
Furthermore, clearly you need to remove your rule 1.
--Ken
>
> BTW, the URL to the latest document is
> http://stupid.domain.name/idnabis/table-latest.html.
>
> Other changes you will see are:
>
> (a) The list of rules (that you see above) will be included in the
> document
> (b) The scripts will be in English alphabetical order
>
> Patrik