KATS (Korean Agency for Technology and Standards)'s Comments on the Unicode Codepoints and IDNA Internet-Draft
Kent Karlsson
kent.karlsson14 at comhem.se
Sun Nov 2 11:58:08 CET 2008
Patrik Faltstrom wrote:
> > Given Kent's note and Ken's warnings about 5.2, wouldn't it be
> > better to do this by a rule disallowing Hangul_Syllable_Type
> > equal to "L", "V", or "T"? Wouldn't that, in preference to
> > specific ranges, provide protection against future additions and
> > changes?
>
> There are a few different ways of doing this.
>
> First, we can add "Hangul Jamo" to 2.1.4. IgnorableBlocks (D).
[> That would be the block 1100..11FF (in Unicode 5.1).]
D: block(cp) in {Combining Diacritical Marks for Symbols,
Musical Symbols, Ancient Greek Musical Notation,
Private Use Area}
1) I *don't* think the Hangul blocks (plural!) belong there. There will
be new blocks for Hangul Jamo, IIUC for 5.2:
A960-A97F; Hangul Jamo Extended-A
D7B0-D7FF; Hangul Jamo Extended-B
2) You have included "E000..F8FF; Private Use Area" in your set above,
but not
F0000..FFFFF; Supplementary Private Use Area-A
100000..10FFFF; Supplementary Private Use Area-B
Why is that? Is "Private Use Area" in the set supposed to cover all
three blocks with that as a *sub*string of the name? I would suggest
not to use a substring approach here.
3) I think
FE00..FE0F; Variation Selectors
E0000..E007F; Tags
E0100..E01EF; Variation Selectors Supplement
should be in the set of IgnorableBlocks as well (though these are
also covered by the IgnorableProperties (C) rule, as well as
not being 2.1.1. LetterDigits (A)). Also
D800..DB7F; High Surrogates
DB80..DBFF; High Private Use Surrogates
DC00..DFFF; Low Surrogates
belong in that set (even though these are excluded by not being
2.1.1. LetterDigits (A)).
Alternatively, include just "Combining Diacritical Marks for Symbols"
in IgnorableBlocks, since all the other things there are excluded
anyway by other rules.
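
A minimal sketch of the naming issue raised in 2), assuming a hand-built block table that contains only the ranges quoted in this message (a real implementation would load the full Blocks.txt). The helper names `block`, `ignorable_exact` and `ignorable_substring` are hypothetical and not taken from the draft. It shows that an exact block-name match on "Private Use Area" leaves the two supplementary areas out, while a substring match silently pulls them in.

```python
from bisect import bisect_right

# Block ranges quoted in this message, sorted by start; NOT a complete table.
BLOCKS = [
    (0x1100, 0x11FF, "Hangul Jamo"),
    (0xA960, 0xA97F, "Hangul Jamo Extended-A"),
    (0xD7B0, 0xD7FF, "Hangul Jamo Extended-B"),
    (0xD800, 0xDB7F, "High Surrogates"),
    (0xDB80, 0xDBFF, "High Private Use Surrogates"),
    (0xDC00, 0xDFFF, "Low Surrogates"),
    (0xE000, 0xF8FF, "Private Use Area"),
    (0xFE00, 0xFE0F, "Variation Selectors"),
    (0xE0000, 0xE007F, "Tags"),
    (0xE0100, 0xE01EF, "Variation Selectors Supplement"),
    (0xF0000, 0xFFFFF, "Supplementary Private Use Area-A"),
    (0x100000, 0x10FFFF, "Supplementary Private Use Area-B"),
]
_STARTS = [start for start, _, _ in BLOCKS]

# Hypothetical stand-in for the draft's set D, reduced to one name.
IGNORABLE = {"Private Use Area"}


def block(cp):
    """Return the block name for a code point, or None if not in the table."""
    i = bisect_right(_STARTS, cp) - 1
    if i >= 0:
        _, end, name = BLOCKS[i]
        if cp <= end:
            return name
    return None


def ignorable_exact(cp):
    """Exact block-name membership: only names listed in the set match."""
    return block(cp) in IGNORABLE


def ignorable_substring(cp):
    """Substring matching: also catches blocks whose name merely contains an entry."""
    name = block(cp)
    return name is not None and any(entry in name for entry in IGNORABLE)
```

With exact matching, each of the three private use blocks has to be named explicitly in the set, which makes the intended coverage visible in the rule itself.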
> Secondly, we can use the Hangul_Syllable_Type as defined in
> HangulSyllableType.txt.
I think that would be preferable, since that definition need not
be changed when new Hangul Jamo blocks are added (which is not
entirely unlikely even after 5.2).
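
A minimal sketch of such a property-based rule, assuming the `range ; value` line format of HangulSyllableType.txt. The `SAMPLE` text is a short illustrative excerpt only; the real file for the Unicode version in use should be loaded instead. The function names are hypothetical and not taken from the draft.

```python
import re

# Illustrative excerpt in the HangulSyllableType.txt line format.
# Assumption: replace with the full file from the UCD for the target version.
SAMPLE = """\
1100..115F    ; L # leading consonants
1160..11A7    ; V # vowels
11A8..11FF    ; T # trailing consonants
AC00          ; LV
AC01..AC1B    ; LVT
"""

_LINE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)")


def parse(text):
    """Parse 'XXXX..YYYY ; value' lines into (start, end, value) tuples."""
    ranges = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        m = _LINE.match(line)
        if m:
            start = int(m.group(1), 16)
            end = int(m.group(2) or m.group(1), 16)  # single code point
            ranges.append((start, end, m.group(3)))
    return ranges


def hangul_syllable_type(cp, ranges):
    """Return the property value for cp, or 'NA' (Not_Applicable) by default."""
    for start, end, value in ranges:
        if start <= cp <= end:
            return value
    return "NA"


def disallowed_jamo(cp, ranges):
    """The rule under discussion: disallow Hangul_Syllable_Type L, V or T."""
    return hangul_syllable_type(cp, ranges) in {"L", "V", "T"}


RANGES = parse(SAMPLE)
```

Because the rule keys on the property value and not on fixed ranges, code points added to the data file in a later Unicode version are covered without changing the rule.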
>
> What I have not found any Unicode definition of, so they have to be
> exceptions, are the Bangjeom, 302A..302F. Help would be
> appreciated.
They have compatibility decompositions, and so are excluded already
by the rule in 2.1.2. Unstable (B). I think that is sufficient.
(Those particular compatibility decompositions are, hmm, wrong. But
that is another matter.)
/kent k
> Please advise, specifically those who know how additional characters
> are added to 5.2.
>
> Patrik