KATS (Korean Agency for Technology and Standards)'s Comments on the Unicode Codepoints and IDNA Internet-Draft
Patrik Fältström
patrik at frobbit.se
Mon Nov 3 14:27:38 CET 2008
On 2 nov 2008, at 12.58, Kent Karlsson wrote:
> D: block(cp) in {Combining Diacritical Marks for Symbols,
> Musical Symbols, Ancient Greek Musical Notation,
> Private Use Area}
>
> 1) I *don't* think the Hangul blocks (plural!) belong there. There will
> be new blocks for Hangul Jamo, IIUC for 5.2:
> A960-A97F; Hangul Jamo Extended-A
> D7B0-D7FF; Hangul Jamo Extended-B
For the next version of the document I am NOT choosing this path.
> 2) You have included "E000..F8FF; Private Use Area" in your set above, but not
> F0000..FFFFF; Supplementary Private Use Area-A
> 100000..10FFFF; Supplementary Private Use Area-B
> Why is that? Is "Private Use Area" in the set supposed to cover all
> three blocks with that as a *sub*string of the name? I would suggest
> not to use a substring approach here.
That is a bug in the draft. In reality the Private Use Area (and, I am
sure, some others) will be DISALLOWED because its codepoints are
assigned but do not match any rule that yields a PVALID result.
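The substring concern can be sidestepped by testing a character property rather than a block name. The following is a minimal sketch, not the draft's rule: it assumes Python's bundled unicodedata module (which tracks a newer Unicode version than the one the draft targets), and is_private_use is a hypothetical helper name. It only illustrates that all three private-use ranges share General_Category Co.

```python
import unicodedata


def is_private_use(cp: int) -> bool:
    """Return True if the code point has General_Category Co (Private Use).

    Hypothetical helper for illustration; it relies on the Unicode data
    shipped with the running Python, not on block names.
    """
    return unicodedata.category(chr(cp)) == "Co"
```

A check of this kind covers E000..F8FF as well as both supplementary private-use planes without any matching on block names.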
> 3) I think
> FE00..FE0F; Variation Selectors
FE00..FE19 ; DISALLOWED # VARIATION SELECTOR-1..PRESENTATION FORM FOR
> E0000..E007F; Tags
> E0100..E01EF; Variation Selectors Supplement
> should be in the set of IgnorableBlocks as well (though these are
> also covered by the IgnorableProperties (C) rule, as well as
> not being 2.1.1. LetterDigits (A)). Also
E0000 ; UNASSIGNED # <reserved>
E0001 ; DISALLOWED # LANGUAGE TAG
E0002..E001F; UNASSIGNED # <reserved>..<reserved>
E0020..E007F; DISALLOWED # TAG SPACE..CANCEL TAG
E0080..E00FF; UNASSIGNED # <reserved>..<reserved>
E0100..E01EF; DISALLOWED # VARIATION SELECTOR-17..VARIATION SELECTOR-25
E01F0..EFFFD; UNASSIGNED # <reserved>..<reserved>
EFFFE..10FFFE; DISALLOWED # <noncharacter>..<noncharacter>
> D800..DB7F; High Surrogates
> DB80..DBFF; High Private Use Surrogates
> DC00..DFFF; Low Surrogates
> belong in that set (even though these are excluded by not being
> 2.1.1. LetterDigits (A)).
D800..FA0D ; DISALLOWED # <Non Private Use High Surrogate>..CJK COMPAT
> Alternatively, include just "Combining Diacritical Marks for Symbols"
> in IgnorableBlocks, since all the other things there are excluded
> anyway by other rules.
>
>> Secondly, we can use the Hangul_Syllable_Type as defined in
>> HangulSyllableType.txt.
>
> I think that would be preferable, since that definition need not
> be changed when new Hangul Jamo blocks are added (which is not
> entirely unlikely even after 5.2).
I am now adding a rule that makes codepoints DISALLOWED when their
Hangul_Syllable_Type is one of L, V or T.
Is that a correct understanding of the situation?
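A rule of this kind could be sketched as follows. This assumes input in the "range ; value # comment" layout used by HangulSyllableType.txt. The SAMPLE text is only an illustrative excerpt in that layout and should be replaced by the real file for the Unicode version in question. The names parse_hst, hangul_syllable_type and is_conjoining_jamo are hypothetical.

```python
# Illustrative excerpt in the HangulSyllableType.txt layout (an assumption;
# use the real data file for the Unicode version being targeted).
SAMPLE = """\
1100..115F    ; L
A960..A97C    ; L
1160..11A7    ; V
D7B0..D7C6    ; V
11A8..11FF    ; T
D7CB..D7FB    ; T
AC00          ; LV
AC01..AC1B    ; LVT
"""


def parse_hst(text):
    """Parse 'XXXX..YYYY ; value # comment' lines into (first, last, value)."""
    ranges = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        cps, value = (field.strip() for field in line.split(";"))
        first, _, last = cps.partition("..")
        ranges.append((int(first, 16), int(last or first, 16), value))
    return ranges


def hangul_syllable_type(cp, ranges):
    """Return the listed type for cp, or 'NA' if no range contains it."""
    for first, last, value in ranges:
        if first <= cp <= last:
            return value
    return "NA"


def is_conjoining_jamo(cp, ranges):
    """True when the type is L, V or T, i.e. the rule would DISALLOW cp."""
    return hangul_syllable_type(cp, ranges) in {"L", "V", "T"}
```

Because the test is driven by the property file rather than by block names, new Jamo blocks would be picked up by loading a newer data file, with no change to the rule itself.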
>> What I have not found any Unicode definition for, so they have to be
>> exceptions, are the Bangjeom, 302A..302F. Help would be
>> appreciated.
>
> They have compatibility decompositions, and so are excluded already
> by the rule in 2.1.2. Unstable (B). I think that is sufficient.
Hmm...they do not match Unstable in my program. And because of that I
have to add them as exceptions.
Can you please check again?
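One way to cross-check this independently is sketched below. It assumes the Unstable rule is the round trip NFKC(casefold(NFKC(cp))) != cp, and it uses Python's bundled unicodedata, whose Unicode version is newer than the one the draft targets, so results may differ from the draft's tables. The names is_unstable and report are hypothetical.

```python
import unicodedata


def is_unstable(cp: int) -> bool:
    """Assumed form of the Unstable test: NFKC(casefold(NFKC(cp))) != cp."""
    ch = chr(cp)
    nfkc = unicodedata.normalize("NFKC", ch)
    folded = unicodedata.normalize("NFKC", nfkc.casefold())
    return folded != ch


def report(first: int, last: int):
    """List (hex code point, decomposition field, unstable?) for a range.

    The decomposition field is the raw UnicodeData mapping; an empty
    string means the character has no decomposition at all.
    """
    return [
        (f"{cp:04X}", unicodedata.decomposition(chr(cp)), is_unstable(cp))
        for cp in range(first, last + 1)
    ]
```

Calling report(0x302A, 0x302F) lists the six code points in question together with their decomposition fields, which shows directly whether a compatibility decomposition is present.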
Patrik