IDN uses unicode because...

Sam Vilain sam.vilain at catalyst.net.nz
Wed Jul 9 05:08:50 CEST 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Kenneth Whistler wrote:
>> The CJK screw-up (where for
>> the CJK range codepoints were arbitrarily assigned to units of meaning
>> rather than actual characters) springs to mind!
> I don't know where *that* assertion came from. Whatever
> its origin, it is complete misinformation about CJK characters
> in Unicode.
> Can you cite certain sources for it?

Well, http://en.wikipedia.org/wiki/Unicode#Issues for a start :-)

After a bit more research, it seems my statement is both dated (see, for
instance, http://www.personal.psu.edu/ejp10/psu/gotunicode/japanese.html)
and quite inaccurate (according to
http://www.unicode.org/versions/Unicode5.0.0/ch12.pdf, the semantic
component is only one of three primary attributes used when deciding
whether to unify a character; I found tables 12-5 and 12-6 particularly
helpful on this front).  It was largely hearsay, I must confess.

Anyway, this is a bit off-topic, and

> "IDNA uses the Unicode character repertoire, for continuity
> with IDNA2003."

avoids the issue entirely.
- --
Sam Vilain, Chief Yak Shaver, Catalyst IT (NZ) Ltd.
phone: +64 4 499 2267        PGP ID: 0x66B25843
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEUEARECAAYFAkh0K8EACgkQ/AZAiGayWEO0jACgoMOyFNdcEpJTHjhMZsjZGczZ
wgsAl2GxvHLdC8DNWx579kjIMelu4CA=
=x12O
-----END PGP SIGNATURE-----

