prohibiting previously mapped and unmapped characters

Martin Duerst duerst at it.aoyama.ac.jp
Fri Dec 29 11:28:39 CET 2006


At 02:06 06/12/01, Yoshiro YONEYA wrote:
>On Wed, 29 Nov 2006 09:42:11 -0800 "Erik van der Poel" <erikv at google.com> wrote:

>There will be some sort of confusion for interpreting idnabis-tables 
>document.  I feel some classes marked as "No" should be investigated 
>deeply.  As pointed by Erik, for Japanese, compatible characters 
>(full-width/half-width) in range of U+FF01..5E and U+FF65..9F should 
>be noted as "available in input",

I think that should be a matter of UI implementation. Good implementations
will most probably provide this for end-user input, especially in
mixed-label and mixed-name situations. But thinking again about the
examples that Erik and Mark brought up from Google, it's really
strange to see ASCII-only names written with accidental full-width
characters, and we should try hard to make sure such cases don't
spread (e.g. in HTML documents, ...). The reason is that there
are still large parts of the infrastructure that don't deal with
IDNs (the easiest way to describe these parts of the infrastructure
is to say "everything except a few well-known browsers").
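
To make the "matter of UI implementation" point a bit more concrete,
here is a minimal sketch in Python of catching and folding width
variants at input time. It assumes that NFKC normalization (from the
standard unicodedata module) is an acceptable stand-in for the input
mapping; the names has_width_variants and fold_width are hypothetical
and are not taken from any IDNA draft.

```python
import unicodedata

# Ranges quoted in this thread: full-width ASCII variants and
# half-width katakana variants.
WIDTH_VARIANT_RANGES = ((0xFF01, 0xFF5E), (0xFF65, 0xFF9F))


def has_width_variants(label: str) -> bool:
    """Return True if the label contains a character from the ranges above."""
    return any(lo <= ord(ch) <= hi
               for ch in label
               for lo, hi in WIDTH_VARIANT_RANGES)


def fold_width(label: str) -> str:
    """Assumption: NFKC stands in for the UI's input mapping.

    NFKC also applies other compatibility mappings, so it is broader
    than a pure width fold.
    """
    return unicodedata.normalize("NFKC", label)


accidental = "\uFF47\uFF4F\uFF4F\uFF47\uFF4C\uFF45"  # full-width "google"
if has_width_variants(accidental):
    print(fold_width(accidental))
```

A UI could run a check like this before a name is written into an
HTML document or handed to software that does not deal with IDNs, so
that an ASCII-only name leaves the input field as plain ASCII.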

>and CJK symbols in range of U+3005..07 
>should be noted as "available as IDNs".

Definitely for U+3005 (a 'repetition' character sometimes used in
personal names, no potential for confusion) and U+3007
(the 'Kanji' version of the numeral zero). I'm not very
sure about U+3006 (a special sign for 'closed', 'finished',
'deadline',...), but Yoshiro may have some good examples where
it's used in names.
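
For anyone who wants to look at these three code points directly,
the following is a small sketch using Python's standard unicodedata
module. The helper name describe is hypothetical. It only reports
Unicode properties (name, general category, and whether NFKC leaves
the character unchanged); it says nothing about what the tables
document should decide.

```python
import unicodedata


def describe(cp: int) -> tuple[str, str, bool]:
    """Return (name, general category, unchanged-by-NFKC) for a code point."""
    ch = chr(cp)
    stable = unicodedata.normalize("NFKC", ch) == ch
    return unicodedata.name(ch), unicodedata.category(ch), stable


for cp in (0x3005, 0x3006, 0x3007):
    name, category, stable = describe(cp)
    print(f"U+{cp:04X}  {category}  {name}  NFKC-stable={stable}")
```

The general categories differ among the three characters, which is
one reason a rule based on category alone can treat them differently
from ordinary Kanji.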

Regards,    Martin.



#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst at it.aoyama.ac.jp     


