What rules have been used for the current list of codepoints?
Michael Everson
everson at evertype.com
Thu Dec 14 22:56:00 CET 2006
At 19:58 +0100 2006-12-14, Patrik Fältström wrote:
>On 14 dec 2006, at 11.10, Michael Everson wrote:
>
>>, but *I* am absolutely sure that you cannot
>>exclude characters from this block by excluding
>>the block. This will deny IDN to millions of
>>people.
>>
>>Is that clear enough?
>
>This is exactly my point.
>
>Just because some of the codepoints are really
>really really important, we have to include the
>whole set of codepoints of that
>{script,block,class,whatever}.

I never said that. I say there are characters we
know we need. You lot are trying to do this
algorithmically, by assuming that the content of
certain *blocks* in the UCS is anything other
than accidental. Certainly for Latin that is not
the case.

So the mistake is to try to do what you are
doing, to base things on UCS blocks.

>Some other people on this list say that when
>selecting that whole set, we will get also some
>codepoints "for free" that we do not want.

I said you MAY NOT exclude the IPA block. I did
not say that you MUST include everything in it.
Indeed some of the other blocks for which you
include everything might well benefit from
weeding.

>That is for me evidence that the selectors that
>we discuss (and should continue to discuss
>obviously) are not good enough.

The UCS Block is not a good selector. I believe
this has been said before, and for the same
reasons: schwa, ezh, and a dozen African letters.

We should begin with the script property, because we
need to restrict the mixing of certain (most)
scripts within labels. Are you able to use that
property to distinguish between characters?
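
A minimal Python sketch of the difference between the two selectors. It is
an illustration only, not anything proposed on the list. The block ranges
are a hand-copied subset of the Unicode Blocks.txt data, and because the
Python standard library does not expose the Script property, the
hypothetical helper looks_latin() uses the character name prefix as a
rough stand-in for Script=Latin.

```python
import unicodedata

# Assumed subset of block ranges, hand-copied from Unicode's Blocks.txt.
BLOCKS = {
    "Basic Latin": (0x0000, 0x007F),
    "Latin-1 Supplement": (0x0080, 0x00FF),
    "Latin Extended-A": (0x0100, 0x017F),
    "Latin Extended-B": (0x0180, 0x024F),
    "IPA Extensions": (0x0250, 0x02AF),
}


def block_of(ch):
    """Return the name of the block containing ch, or None if not listed."""
    cp = ord(ch)
    for name, (lo, hi) in BLOCKS.items():
        if lo <= cp <= hi:
            return name
    return None


def allowed_by_block(ch, excluded=("IPA Extensions",)):
    """Block-based selector: reject every character in an excluded block."""
    return block_of(ch) not in excluded


def looks_latin(ch):
    """Rough stand-in for Script=Latin, based on the character name.

    This is an approximation for illustration; a real implementation
    would read the Script property from Unicode's Scripts.txt.
    """
    return unicodedata.name(ch, "").startswith("LATIN ")


if __name__ == "__main__":
    for ch in ("a", "\u0259", "\u0292"):  # a, schwa, ezh
        print(
            f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            "| block:", block_of(ch),
            "| allowed by block rule:", allowed_by_block(ch),
            "| looks Latin:", looks_latin(ch),
        )
```

Schwa (U+0259) and ezh (U+0292) both sit in the IPA Extensions block, so a
rule that excludes that block rejects them, while a script-based test
still treats them as Latin letters.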
--
Michael Everson * http://www.evertype.com