Katakana Middle Dot again (Was: tables-06b.txt: A.5, A.6, A.9)
Harald Tveit Alvestrand
harald at alvestrand.no
Fri Aug 7 13:29:10 CEST 2009
Yoshiro YONEYA wrote:
> Dear John,
>
>
>> U+3005 and U+3007 are identified as "Han" in the Unicode table
>> Scripts.txt, so need no special treatment.
>>
>
> Exactly!
>
>
>> U+3006 (IDEOGRAPHIC CLOSING MARK) is listed as in "Common"
>> script in that table. Without understanding the use of this
>> character, it is plausible that it would occur in a label that
>> consisted only of it, the middle dot, and, e.g., Romaji? If it
>> is not going to be used except when other ideographic characters
>> are present, there is no need to make an exception, although a
>> comment might be in order. Remember that, as you suggested, the
>> test now requires only a single character that is unambiguously
>> Hiragana, Katakana, or Han.
>>
>
> U+3006 (IDEOGRAPHIC CLOSING MARK) is a kind of simplified form
> of U+7DE0. U+7DE0 is sometimes replaced by U+3006 when it is
> used to mean "closing", so U+3006 should be treated the same
> as Han.

I reiterate the question:
Is it reasonable to assume that there exists a reasonable desire to
register labels that contain IDEOGRAPHIC CLOSING MARK and KATAKANA
MIDDLE DOT, but no other Han, Hiragana or Katakana character?

Again, we are seeking a justification for overriding a Unicode
determination. I don't understand the reason for the determination that
placed U+7DE0 in script "Han" but U+3006 in script "Common", but in
general we have tried to keep the number of special exceptions to the
rules derived from Unicode properties as small as possible.
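
To make the question concrete, here is a rough Python sketch of the test as
described above: a label containing KATAKANA MIDDLE DOT is acceptable only if
at least one character in it is unambiguously Hiragana, Katakana, or Han. The
names `SCRIPT_RANGES`, `script_of` and `middle_dot_allowed` are hypothetical,
and the ranges are an abbreviated, hand-copied subset of Scripts.txt used for
illustration only; a real implementation would read the full table.

```python
# Abbreviated, illustrative subset of Scripts.txt (an assumption of this
# sketch, not the complete Unicode data).
SCRIPT_RANGES = {
    "Hiragana": [(0x3041, 0x3096), (0x309D, 0x309F)],
    "Katakana": [(0x30A1, 0x30FA), (0x30FD, 0x30FF)],
    "Han": [(0x3005, 0x3005), (0x3007, 0x3007),
            (0x3400, 0x4DB5), (0x4E00, 0x9FA5)],
}

KATAKANA_MIDDLE_DOT = "\u30fb"  # script "Common" in Scripts.txt


def script_of(ch):
    """Return 'Hiragana', 'Katakana' or 'Han' for ch, else None."""
    cp = ord(ch)
    for script, ranges in SCRIPT_RANGES.items():
        if any(lo <= cp <= hi for lo, hi in ranges):
            return script
    return None


def middle_dot_allowed(label):
    """Apply the contextual test for KATAKANA MIDDLE DOT to one label."""
    if KATAKANA_MIDDLE_DOT not in label:
        return True  # the rule only constrains labels that use the dot
    # One unambiguous Hiragana, Katakana or Han character is enough.
    return any(script_of(ch) is not None for ch in label)
```

With no exception for U+3006, `middle_dot_allowed("\u3006\u30fb")` is false,
while the same label written with U+7DE0 passes. That difference is the case
the question above is about.
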
Harald