Normalization of Hangul

Wed Feb 20 23:20:12 CET 2008

On 20 feb 2008, at 11.16, Martin Duerst wrote:

>> Apart from this, Erik and I now have (almost) interoperable
>> implementations of the specification in the -tables document.
>
> Can you tell us where there is a difference, and who uses the
> Java code?

The difference is just one code point, and I think I am the one doing
something odd with it.

For Hangul I do not use the Java code, as I do not know Java. I have
implementations of the algorithms in Perl and Ruby. The Ruby
implementation implements full NFKC, case folding, etc. for
everything except Hangul, where I rely on what is in the Perl
libraries. The Perl implementation uses the Perl libraries (I
have not written the NFKC part myself).

Erik has sent me a description of the Hangul algorithm separately, so
I will now implement that as well, including an implementation in C,
which I do know.
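
For reference, here is a minimal sketch of the arithmetic Hangul syllable
decomposition and composition as given in the Unicode Standard (section
3.12, Conjoining Jamo Behavior). It is not Erik's description, which is
not reproduced in this message. The constants come from the standard;
the function names are illustrative only, and Python is used just for
brevity. It handles only the Hangul step, not the rest of NFKC.

```python
# Constants from the Unicode Standard, section 3.12.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT   # 588 syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT   # 11172 precomposed syllables in total


def decompose_hangul(text):
    """Replace each precomposed Hangul syllable with its conjoining jamo."""
    out = []
    for ch in text:
        s_index = ord(ch) - S_BASE
        if 0 <= s_index < S_COUNT:
            out.append(chr(L_BASE + s_index // N_COUNT))
            out.append(chr(V_BASE + (s_index % N_COUNT) // T_COUNT))
            t_index = s_index % T_COUNT
            if t_index:  # index 0 means "no trailing consonant"
                out.append(chr(T_BASE + t_index))
        else:
            out.append(ch)
    return "".join(out)


def compose_hangul(text):
    """Combine L+V into an LV syllable, and LV+T into an LVT syllable."""
    out = []
    for ch in text:
        if out:
            last, cp = ord(out[-1]), ord(ch)
            l_index, v_index = last - L_BASE, cp - V_BASE
            if 0 <= l_index < L_COUNT and 0 <= v_index < V_COUNT:
                # Leading consonant followed by a vowel: form an LV syllable.
                out[-1] = chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT)
                continue
            s_index, t_index = last - S_BASE, cp - T_BASE
            if (0 <= s_index < S_COUNT and s_index % T_COUNT == 0
                    and 0 < t_index < T_COUNT):
                # LV syllable followed by a trailing consonant: form LVT.
                out[-1] = chr(last + t_index)
                continue
        out.append(ch)
    return "".join(out)
```

For example, decompose_hangul("\ud55c") gives U+1112 U+1161 U+11AB, and
compose_hangul applied to that sequence gives U+D55C back.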

In the Ruby implementation, though, I currently use the derived property
value Default_Ignorable_Code_Point rather than the base properties. I
would like to go back to using the original properties -- at least for
the implementation -- so that, when I am asked whether there are
interoperable implementations, I can really say I have implemented
things from the base properties of the Unicode Standard.

  Patrik