Tables and contextual rule for Katakana middle dot

"Martin J. Dürst" duerst at it.aoyama.ac.jp
Wed Apr 8 06:43:45 CEST 2009


First, I'm sympathetic to the view that punctuation should be excluded
in general. But I think it's not that easy. There is a continuum between
characters such as "." (a period used as punctuation, not really
necessary in a label), "'" (an apostrophe in some contexts and part
of many English words; we have become accustomed to not having
it in domain names, but I guess quite a few people would be rooting
for it if domain names for English weren't a done deal), and
characters closer to letters than the English apostrophe.

Second, on the visual confusability of ・ (middle dot), I'm personally
not too worried. And I think visual confusability in handwriting,
the way John has described it, isn't really what we should check for.
I'm sure everybody would read Latin characters with dot-like marks
between them as ordinary dots to a first approximation, and as middle
dots only if the handwriting was very careful. That Latin-oriented OCR
software gets things wrong isn't surprising either: such software gets
a lot of things wrong, and it definitely can't recognize characters it
is not programmed for.

Also, I'm not worried about Japanese being "famous for artistic
calligraphy and font design". There are a lot of fancy fonts for
Japanese, but far fewer than for European scripts (the large number of
characters just makes it much more expensive to create a font), and,
just as for European scripts, such fonts are not customarily used when
displaying domain names. And even in these fonts, there is not too much
artistry going on with dots and middle dots.

I have written to a group of Japanese typography experts, the authors of
http://www.w3.org/TR/2008/WD-jlreq-20081015/, many of whom were also
involved in JIS 4051, and asked them for feedback from a typographic
point of view on the context of the middle dot. I'll relay whatever I
get from them here.

If I had to decide now, I would conclude that the middle dot can be
allowed in the protocol, and that only registries that think it is
needed for their users, such as the Japanese one, should allow it. But
I would also be okay with permitting the middle dot only in contexts
where there is a Kanji, Hiragana, or Katakana character on at least one
side. In my view, middle dots between Latin characters occur in practice
simply because the middle dot is available; they can easily be replaced
by the hyphen, which is typographically more appropriate for Latin.
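
To make the second option concrete, here is a rough Python sketch of
such an adjacency check. It is only an illustration, not a proposed
rule text: the function names are made up for this example, and the
script test is a crude approximation based on Unicode character names
(Python's standard library does not expose the Script property), so
for instance halfwidth Katakana and the ideographic iteration mark are
not treated as Japanese characters here.

```python
import unicodedata

MIDDLE_DOT = "\u30fb"  # KATAKANA MIDDLE DOT

# Assumption: character-name prefixes stand in for the Hiragana, Katakana,
# and Han scripts. This is an approximation, not the Unicode Script property.
JAPANESE_NAME_PREFIXES = ("HIRAGANA", "KATAKANA", "CJK UNIFIED IDEOGRAPH")


def is_japanese_char(ch):
    """Return True if ch looks like a Kanji, Hiragana, or Katakana character."""
    if ch == MIDDLE_DOT:
        # The middle dot's own name starts with "KATAKANA"; it must not
        # count as context for itself or for a neighbouring middle dot.
        return False
    return unicodedata.name(ch, "").startswith(JAPANESE_NAME_PREFIXES)


def middle_dots_in_context(label):
    """Return True if every middle dot in label has a Kanji, Hiragana,
    or Katakana character on at least one side (hypothetical rule)."""
    for i, ch in enumerate(label):
        if ch != MIDDLE_DOT:
            continue
        before = label[i - 1] if i > 0 else ""
        after = label[i + 1] if i + 1 < len(label) else ""
        has_context = (before != "" and is_japanese_char(before)) or (
            after != "" and is_japanese_char(after)
        )
        if not has_context:
            return False
    return True
```

Under this sketch, a label with a middle dot between two Katakana
words passes, while a label like the Play<KATAKANA MIDDLEDOT>Station
example quoted below, with Latin letters or digits on both sides of
each middle dot, would be rejected.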

Regards,    Martin.

On 2009/04/07 22:35, Harald Alvestrand wrote:
> Yoshiro YONEYA wrote:
>> Dear Patrik-san,
>>
>> Japanese uses Hiragana, Katakana, Han, Alphabet letters (a-z), and
>> digits (0-9) for names.  KATAKANA MIDDLEDOT is usually used with those
>> names, so the following kind of case really exists and is used:
>>
>>      Play<KATAKANA MIDDLEDOT>Station<KATAKANA MIDDLEDOT>4.jp
>>
>> That is the reason why I said "Japanese context".
>>
>> To be precise, Japanese scripts (for IDN) consist of:
>>
>>      Hiragana, Katakana, Han, Alphabet, Digit,
>>      IDEOGRAPHIC CLOSING MARK, IDEOGRAPHIC NUMBER ZERO,
>>      KATAKANA MIDDLEDOT and IDEOGRAPHIC ITERATION MARK
>>
>> Extracting Alphabet and Digit from the list is unacceptable.
>>
>> I'll try to express this ambiguous situation more clearly.
>>
>> Regards,
>>
>>
> Speaking with sadness:
>
> If this is the case, I think we will have to declare KATAKANA MIDDLE DOT
> to have the same status as the apostrophe: Not permitted.
>
>                 Harald

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst at it.aoyama.ac.jp

