New version, draft-faltstrom-idnabis-tables-02.txt, available

Harald Alvestrand harald at alvestrand.no
Tue Jun 19 14:04:31 CEST 2007


Martin Duerst wrote:
> Okay, let's give it one try.
>
> The CJK Unified Ideographs block is stable. It has been there
> since the beginning of Unicode, and is based on the unification
> of well-used national standards (what's available in terms of
> ideographs on an everyday PC (nowadays also mobile phone) in
> China, Japan, or Korea). I at least do not know about any
> kind of bug in this range.
>
> There are quite a number of confusables (both prima facie visibly
> as well as semantically) due to various simplifications and the
> large number of symbols as such. However, these can only be handled
> by subsetting (e.g. in a national context) or bundling (e.g.
> using RFC 3743). It is impossible to say a priori that some of
> them are not allowed, so all of them should go into the ALLOWED
> category.
>
> Please note that this doesn't deal with additional blocks of
> ideographs (Extension A, Extension B, ..., Compatibility),
> because I cannot make the same statements for these as above.
>
> Is a statement like the above what you are looking for?
>   
That's exactly the kind of statement I'm looking for, at least.

Checking: By "CJK Unified Ideographs", you mean the range 4E00..9FFF, as 
described in Blocks.txt from the Unicode database version 5.0.0.

From my reading of the "Scripts.txt" file, these have the Script
property of "Han".

You are not willing to speak for the following ranges also in the script 
"Han":

2E80..2E99    ; Han # So  [26] CJK RADICAL REPEAT..CJK RADICAL RAP
2E9B..2EF3    ; Han # So  [89] CJK RADICAL CHOKE..CJK RADICAL C-SIMPLIFIED TURTLE
2F00..2FD5    ; Han # So [214] KANGXI RADICAL ONE..KANGXI RADICAL FLUTE
3005          ; Han # Lm       IDEOGRAPHIC ITERATION MARK
3007          ; Han # Nl       IDEOGRAPHIC NUMBER ZERO
3021..3029    ; Han # Nl   [9] HANGZHOU NUMERAL ONE..HANGZHOU NUMERAL NINE
3038..303A    ; Han # Nl   [3] HANGZHOU NUMERAL TEN..HANGZHOU NUMERAL THIRTY
303B          ; Han # Lm       VERTICAL IDEOGRAPHIC ITERATION MARK
3400..4DB5    ; Han # Lo [6582] CJK UNIFIED IDEOGRAPH-3400..CJK UNIFIED IDEOGRAPH-4DB5
F900..FA2D    ; Han # Lo [302] CJK COMPATIBILITY IDEOGRAPH-F900..CJK COMPATIBILITY IDEOGRAPH-FA2D
FA30..FA6A    ; Han # Lo  [59] CJK COMPATIBILITY IDEOGRAPH-FA30..CJK COMPATIBILITY IDEOGRAPH-FA6A
FA70..FAD9    ; Han # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COMPATIBILITY IDEOGRAPH-FAD9
20000..2A6D6  ; Han # Lo [42711] CJK UNIFIED IDEOGRAPH-20000..CJK UNIFIED IDEOGRAPH-2A6D6
2F800..2FA1D  ; Han # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D

(Some of these may be eliminated by other rules in the current draft.)
Correct?
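
For anyone who wants to check a listing like the one above mechanically,
here is a minimal Python sketch. It is not part of the draft. It assumes
the "range ; script # category" line layout shown above, and the names
SCRIPTS_EXCERPT, parse_ranges, in_cjk_unified_block and
han_outside_block are hypothetical helpers made up for illustration.
The excerpt holds only a few of the lines quoted above, not the full
Scripts.txt file.

```python
import re

# A few lines copied from the listing above (Scripts.txt layout, Unicode 5.0.0).
SCRIPTS_EXCERPT = """\
2E80..2E99    ; Han # So  [26] CJK RADICAL REPEAT..CJK RADICAL RAP
3005          ; Han # Lm       IDEOGRAPHIC ITERATION MARK
3400..4DB5    ; Han # Lo [6582] CJK UNIFIED IDEOGRAPH-3400..CJK UNIFIED IDEOGRAPH-4DB5
F900..FA2D    ; Han # Lo [302] CJK COMPATIBILITY IDEOGRAPH-F900..CJK COMPATIBILITY IDEOGRAPH-FA2D
"""

# "FIRST[..LAST] ; Script # GeneralCategory ..."
LINE_RE = re.compile(
    r"^([0-9A-Fa-f]+)(?:\.\.([0-9A-Fa-f]+))?\s*;\s*(\w+)\s*#\s*(\w+)"
)

# Block range as stated in the message (Blocks.txt, Unicode 5.0.0).
CJK_UNIFIED_BLOCK = (0x4E00, 0x9FFF)


def parse_ranges(text):
    """Return a list of (first, last, script, category) tuples."""
    ranges = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank lines, comments, anything unparsable
        first = int(match.group(1), 16)
        # A single code point has no "..LAST" part; treat it as first == last.
        last = int(match.group(2) or match.group(1), 16)
        ranges.append((first, last, match.group(3), match.group(4)))
    return ranges


def in_cjk_unified_block(code_point):
    """True if the code point lies in 4E00..9FFF."""
    low, high = CJK_UNIFIED_BLOCK
    return low <= code_point <= high


def han_outside_block(ranges):
    """Han ranges that do not overlap the CJK Unified Ideographs block."""
    low, high = CJK_UNIFIED_BLOCK
    return [
        r for r in ranges
        if r[2] == "Han" and (r[1] < low or r[0] > high)
    ]


if __name__ == "__main__":
    for first, last, script, category in han_outside_block(
        parse_ranges(SCRIPTS_EXCERPT)
    ):
        count = last - first + 1
        print(f"{first:04X}..{last:04X} {script} {category} ({count} code points)")
```

The count computed as last - first + 1 can be compared with the number
in square brackets on each line as a sanity check on the pasted data.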

                            Harald


