Katakana Middle Dot again (Was: tables-06b.txt: A.5, A.6, A.9)

Fri Aug 7 13:47:38 CEST 2009

Harald is asking the key question here:

> Is it reasonable to assume that there exists the reasonable desire  
> to register labels that contain IDEOGRAPHIC CLOSING MARK and  
> KATAKANA MIDDLE DOT, but no other Han, Katakana or Kana character?

First, I assume that it is not intended to permit Katakana Middle Dot
without the presence of at least one other Han, Katakana or Hiragana
character.

Second, I assume the same would be true of the Ideographic Closing Mark.

Third, I have not been able to understand the utility of allowing a  
label consisting only of IDEOGRAPHIC CLOSING MARK and KATAKANA MIDDLE  
DOT.

So I am as puzzled as Harald on this question.
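
The test under discussion can be sketched roughly as follows. This is a
minimal illustration under stated assumptions, not the text of the tables
draft: the function names are hypothetical, and the code point ranges are
a simplified, hand-picked subset of what Scripts.txt assigns to Han,
Hiragana and Katakana.

```python
# Hypothetical sketch of the contextual test for KATAKANA MIDDLE DOT.
# ASSUMPTION: these ranges are a simplified subset of Scripts.txt; a real
# implementation would load the full Unicode Character Database file.
HAN_RANGES = [(0x3005, 0x3005), (0x3007, 0x3007), (0x4E00, 0x9FFF)]
HIRAGANA_RANGES = [(0x3041, 0x3096)]
KATAKANA_RANGES = [(0x30A1, 0x30FA)]
CJK_SCRIPT_RANGES = HAN_RANGES + HIRAGANA_RANGES + KATAKANA_RANGES

KATAKANA_MIDDLE_DOT = "\u30FB"       # script "Common" in Scripts.txt
IDEOGRAPHIC_CLOSING_MARK = "\u3006"  # script "Common" in Scripts.txt


def is_han_or_kana(ch):
    """True if ch falls in one of the simplified Han/Hiragana/Katakana ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in CJK_SCRIPT_RANGES)


def middle_dot_allowed(label):
    """Apply the test: a label containing KATAKANA MIDDLE DOT must also
    contain at least one character that is unambiguously Han, Hiragana or
    Katakana. Labels without the middle dot are not affected by this test."""
    if KATAKANA_MIDDLE_DOT not in label:
        return True
    return any(is_han_or_kana(ch) for ch in label)
```

Under this sketch, a label made up only of U+3006 and U+30FB fails the
test, because both characters are "Common"; a label pairing U+30FB with
U+7DE0 or U+3005 passes. Treating U+3006 as Han would require adding it
to the table as an explicit exception.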

vint

On Aug 7, 2009, at 7:29 AM, Harald Tveit Alvestrand wrote:

> Yoshiro YONEYA wrote:
>> Dear John,
>>
>>
>>> U+3005 and U+3007 are identified as "Han" in the Unicode table
>>> Scripts.txt, so need no special treatment.
>>>
>>
>> Exactly!
>>
>>
>>> U+3006 (IDEOGRAPHIC CLOSING MARK) is listed as in "Common"
>>> script in that table.  Without understanding the use of this
>>> character, is it plausible that it would occur in a label that
>>> consisted only of it, the middle dot, and, e.g., Romaji? If it
>>> is not going to be used except when other ideographic characters
>>> are present, there is no need to make an exception, although a
>>> comment might be in order.  Remember that, as you suggested, the
>>> test now requires only a single character that is unambiguously
>>> Hiragana, Katakana, or Han.
>>>
>>
>> U+3006 (IDEOGRAPHIC CLOSING MARK) is a kind of simplified form
>> of U+7DE0.  U+3006 is sometimes substituted for U+7DE0 when it is
>> used to mean "closing", so U+3006 should be treated the same
>> as Han.
> I reiterate the question:
>
> Is it reasonable to assume that there exists the reasonable desire  
> to register labels that contain IDEOGRAPHIC CLOSING MARK and  
> KATAKANA MIDDLE DOT, but no other Han, Katakana or Kana character?
>
> Again, we are seeking a justification for overriding a Unicode
> determination - I don't understand the reason for the determination  
> that placed U+7DE0 in script "Han" but U+3006 in script "Common",  
> but generally, we have tried to reduce the number of special  
> exceptions to the rules determined by looking at Unicode properties  
> as much as possible.
>
>      Harald
>