Perl Unicode libraries (was: Re: Casefolding Sigma)
patrik at frobbit.se
Wed Jan 30 22:38:59 CET 2008
On 29 jan 2008, at 21.48, Kenneth Whistler wrote:
> I don't personally know of any problems with Unicode::Normalize
> or Unicode::UCD in Perl, but if you use those library modules,
> you need to be aware of the version-dependence and its
> relation to Unicode versions. See, for example:
Correct. I have updated the modules so that they are really at Unicode
5.0.0 even in my older version of Perl.
The problem in Perl is unfortunately that the representation of
Unicode code points is weird. Perl tries to "guess" what encoding a
string is in, and it does so at every operation on the string, just
like the automatic type casting that always happens. For example, it
guesses whether UTF-8 or UTF-16 is used for the encoding of a string.
Forcing one thing or another is not easy.
So, unfortunately, Perl is easy to use but imprecise.
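To make the point about forcing an encoding concrete, here is a minimal sketch of doing the conversions by hand rather than leaving them to Perl. It assumes the core Encode module is available; the variable names are only illustrative, and this is not necessarily how the updated modules mentioned above are used.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Raw octets as they might arrive from a file or socket:
# GREEK SMALL LETTER SIGMA (U+03C3) encoded as UTF-8.
my $octets = "\xCF\x83";

# Decode explicitly, so Perl holds one character instead of two bytes.
my $chars = decode('UTF-8', $octets);

# Encode explicitly when a particular encoding is needed on output.
my $utf16 = encode('UTF-16BE', $chars);

# Prints: octets: 2, characters: 1, UTF-16BE octets: 2
printf "octets: %d, characters: %d, UTF-16BE octets: %d\n",
    length($octets), length($chars), length($utf16);
```

Decoding at input and encoding at output keeps the guessing out of the middle of the program, at the cost of having to remember to do it everywhere.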
I think I have most things under control though.
But thanks for the pointer, Ken!