KATS (Korean Agency for Technology and Standards)'s Comments on the Unicode Codepoints and IDNA Internet-Draft

Kent Karlsson kent.karlsson14 at comhem.se
Sun Nov 2 11:58:08 CET 2008


Patrik Faltstrom wrote:
> > Given Kent's note and Ken's warnings about 5.2, wouldn't it be
> > better to do this by a rule disallowing Hangul_Syllable_Type
> > equal to "L", "V", or "T"?    Wouldn't that, in preference to
> > specific ranges, provide protection against future additions and
> > changes?
> 
> There are a few different ways of doing this.
> 
> First, we can add "Hangul Jamo" to 2.1.4. IgnorableBlocks (D).
[That would be block 1100..11FF in Unicode 5.1.]

   D: block(cp) in {Combining Diacritical Marks for Symbols,
                    Musical Symbols, Ancient Greek Musical Notation,
                    Private Use Area}

1) I *don't* think the Hangul blocks (plural!) belong there. There will
   be new blocks for Hangul Jamo, IIUC for 5.2:
	A960-A97F; Hangul Jamo Extended-A
	D7B0-D7FF; Hangul Jamo Extended-B

2) You have included "E000..F8FF; Private Use Area" in your set above,
   but not 
	F0000..FFFFF; Supplementary Private Use Area-A
	100000..10FFFF; Supplementary Private Use Area-B
   Why is that? Is "Private Use Area" in the set supposed to cover all
   three blocks with that as a *sub*string of the name? I would suggest
   not using a substring approach here.

3) I think
	FE00..FE0F; Variation Selectors
	E0000..E007F; Tags
	E0100..E01EF; Variation Selectors Supplement
   should be in the set of IgnorableBlocks as well (though these are
   also covered by the IgnorableProperties (C) rule, and are not in
   2.1.1. LetterDigits (A) either). Also
	D800..DB7F; High Surrogates
	DB80..DBFF; High Private Use Surrogates
	DC00..DFFF; Low Surrogates
   belong in that set (even though these are already excluded because
   they are not in 2.1.1. LetterDigits (A)).

Alternatively, include just "Combining Diacritical Marks for Symbols"
in IgnorableBlocks, since all the other things there are excluded
anyway by other rules.
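
To illustrate point 2 above, here is a minimal sketch (not taken from the
draft) of how an exact block-name test and a substring test diverge. The
block ranges are the ones quoted in this message; the one-element
IGNORABLE_BLOCKS set and all function names are hypothetical and exist only
for this illustration.

```python
# Block ranges as quoted in this message (Blocks.txt style, Unicode 5.1).
BLOCKS = [
    (0xE000, 0xF8FF, "Private Use Area"),
    (0xF0000, 0xFFFFF, "Supplementary Private Use Area-A"),
    (0x100000, 0x10FFFF, "Supplementary Private Use Area-B"),
    (0xFE00, 0xFE0F, "Variation Selectors"),
    (0xE0100, 0xE01EF, "Variation Selectors Supplement"),
]

# Hypothetical one-element stand-in for IgnorableBlocks (D).
IGNORABLE_BLOCKS = {"Private Use Area"}


def block(cp):
    """Return the block name for a code point, or None if not in the table."""
    for first, last, name in BLOCKS:
        if first <= cp <= last:
            return name
    return None


def ignorable_exact(cp):
    """Exact match: the block name must be literally in the set."""
    return block(cp) in IGNORABLE_BLOCKS


def ignorable_substring(cp):
    """Substring match: any set entry occurring inside the block name."""
    name = block(cp)
    return name is not None and any(b in name for b in IGNORABLE_BLOCKS)
```

With exact matching, U+F0000 is not caught unless "Supplementary Private Use
Area-A" is listed explicitly; with substring matching it is caught silently,
which is the ambiguity the question above is about.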


> Secondly, we can use the Hangul_Syllable_Type as defined in  
> HangulSyllableType.txt.

I think that would be preferable, since that definition need not
be changed when new Hangul Jamo blocks are added (which is not
entirely unlikely even after 5.2).
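
A rough sketch of the property-based rule, assuming the data is read from a
file in the format of HangulSyllableType.txt. The SAMPLE lines below are an
abbreviated excerpt written from memory of the 5.2-era data and should be
checked against the real file; the function names are hypothetical. The point
is only that the rule itself does not mention any block or range.

```python
# Abbreviated sample in the format of HangulSyllableType.txt.
# Assumption: these ranges reflect the 5.2-era file; verify before relying on them.
SAMPLE = """\
1100..115F    ; L
A960..A97C    ; L
1160..11A7    ; V
D7B0..D7C6    ; V
11A8..11FF    ; T
D7CB..D7FB    ; T
AC00          ; LV
AC01..AC1B    ; LVT
"""


def parse(text):
    """Parse 'first..last ; value' lines into (first, last, value) tuples."""
    ranges = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        cps, hst = (field.strip() for field in line.split(";"))
        first, _, last = cps.partition("..")
        ranges.append((int(first, 16), int(last or first, 16), hst))
    return ranges


def hangul_syllable_type(cp, ranges):
    """Return the Hangul_Syllable_Type of cp, or 'NA' if it is not listed."""
    for first, last, hst in ranges:
        if first <= cp <= last:
            return hst
    return "NA"


def is_conjoining_jamo(cp, ranges):
    """The rule itself: disallow code points whose type is L, V, or T."""
    return hangul_syllable_type(cp, ranges) in {"L", "V", "T"}


RANGES = parse(SAMPLE)
```

When a later Unicode version adds jamo in a new block, only the data file
changes; is_conjoining_jamo stays as it is, whereas a block-based rule would
need the new block names added by hand.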


> 
> What I have not found any Unicode definition for, so that they would
> have to be exceptions, are the Bangjeom, 302A..302F. Help would be
> appreciated.

They have compatibility decompositions, and so are excluded already
by the rule in  2.1.2.  Unstable (B). I think that is sufficient.
(Those particular compatibility decompositions are, hmm, wrong. But
that is another matter.)


	/kent k



> Please advise, specifically those who know how additional characters  
> are added to 5.2.
> 
>     Patrik
> 
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update


