Katakana Middle Dot again (Was: tables-06b.txt: A.5, A.6, A.9)
Vint Cerf
vint at google.com
Fri Aug 7 13:47:38 CEST 2009
Harald is asking the key question here:
> Is it reasonable to assume that there exists the reasonable desire
> to register labels that contain IDEOGRAPHIC CLOSING MARK and
> KATAKANA MIDDLE DOT, but no other Han, Katakana or Kana character?
First, I assume that it is not intended to permit Katakana Middle Dot
without the presence of at least one other Han, Katakana or Hiragana
character.
Second, I assume the same would be true of the Ideographic Closing Mark.
Third, I have not been able to understand the utility of allowing a
label consisting only of IDEOGRAPHIC CLOSING MARK and KATAKANA MIDDLE
DOT.
So I am as puzzled as Harald on this question.
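To make the rule under discussion concrete, here is a minimal sketch of
the contextual test as I understand it: KATAKANA MIDDLE DOT is acceptable
only if the label also contains at least one character that is
unambiguously Han, Hiragana or Katakana. The names HAN_KANA_RANGES,
is_han_or_kana and middle_dot_context_ok are hypothetical, and the range
table is a deliberately incomplete stand-in for a real lookup in
Scripts.txt, so treat this as an illustration and not as the rule text.

```python
# Hypothetical, incomplete approximation of Script in {Han, Hiragana, Katakana}.
# A real implementation would consult the Unicode Scripts.txt data instead.
HAN_KANA_RANGES = (
    (0x3005, 0x3005),  # IDEOGRAPHIC ITERATION MARK (Han per Scripts.txt)
    (0x3007, 0x3007),  # IDEOGRAPHIC NUMBER ZERO (Han per Scripts.txt)
    (0x3041, 0x3096),  # Hiragana letters
    (0x30A1, 0x30FA),  # Katakana letters
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs (includes U+7DE0)
)

KATAKANA_MIDDLE_DOT = "\u30FB"       # Script=Common
IDEOGRAPHIC_CLOSING_MARK = "\u3006"  # Script=Common, so not in the table above


def is_han_or_kana(ch):
    """Return True if ch falls in one of the approximate ranges above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in HAN_KANA_RANGES)


def middle_dot_context_ok(label):
    """Apply the contextual test for KATAKANA MIDDLE DOT to one label.

    Labels without the middle dot pass trivially; labels with it pass only
    if some character is unambiguously Han, Hiragana or Katakana.
    """
    if KATAKANA_MIDDLE_DOT not in label:
        return True
    return any(is_han_or_kana(ch) for ch in label)
```

Under this sketch a label made up only of U+3006 and U+30FB fails the
test, because both are "Common". It would pass only if U+3006 were given
an exception and treated as Han, which is exactly the point in question.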
vint
On Aug 7, 2009, at 7:29 AM, Harald Tveit Alvestrand wrote:
> Yoshiro YONEYA wrote:
>> Dear John,
>>
>>
>>> U+3005 and U+3007 are identified as "Han" in the Unicode table
>>> Scripts.txt, so need no special treatment.
>>>
>>
>> Exactly!
>>
>>
>>> U+3006 (IDEOGRAPHIC CLOSING MARK) is listed as in "Common"
>>> script in that table. Without understanding the use of this
>>> character, is it plausible that it would occur in a label that
>>> consisted only of it, the middle dot, and, e.g., Romaji? If it
>>> is not going to be used except when other ideographic characters
>>> are present, there is no need to make an exception, although a
>>> comment might be in order. Remember that, as you suggested, the
>>> test now requires only a single character that is unambiguously
>>> Hiragana, Katakana, or Han.
>>>
>>
>> U+3006 (IDEOGRAPHIC CLOSING MARK) is a kind of simplified form
>> of U+7DE0. U+3006 is sometimes substituted for U+7DE0 when the
>> latter is used to mean "closing", so U+3006 should be treated the
>> same as Han.
> I reiterate the question:
>
> Is it reasonable to assume that there exists the reasonable desire
> to register labels that contain IDEOGRAPHIC CLOSING MARK and
> KATAKANA MIDDLE DOT, but no other Han, Katakana or Kana character?
>
> Again, we are seeking a justification for overriding a Unicode
> determination - I don't understand the reason for the determination
> that placed U+7DE0 in script "Han" but U+3006 in script "Common",
> but generally, we have tried to reduce the number of special
> exceptions to the rules determined by looking at Unicode properties
> as much as possible.
>
> Harald
>