Unicode 7.0.0, (combining) Hamza Above, and normalization

Whistler, Ken ken.whistler at sap.com
Thu Aug 7 23:47:51 CEST 2014

Paul Hoffmann asserted:

> Right. To me, the current processing under NFC is the wrong result. Andrew
> was a bit polite at the end of his message, but it sounds to me that he thinks
> the NFC processing for the new character leads to the wrong result when
> compared to earlier NFC processing.

The issue for the table update comes down to that.

I think it is quite clear, however, that it is not the case that "the current processing
under NFC is the wrong result".

The premises of this argument all come down to implicit (or
occasionally explicit) assertions that the beh-with-hamza encoded
for the Fula implosive b is the *same* character as an existing
Arabic beh character followed by the combining Hamza mark.

They are *NOT* the same. And *if* they are not the same, all the
arguments about NFC being wrong, etc., are pointless.

These implicit assertions that the beh-with-hamza and the sequence
*ARE* the same are as beside the point as heading down the road
of citing any number of other possible similarities in appearance:
for example, claiming that U+063A ARABIC LETTER GHAIN is the *SAME*
character as the sequence U+0639 ARABIC LETTER AIN + U+0307 COMBINING
DOT ABOVE, because the atomic character and that sequence
might look the same.
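
The distinction can be checked mechanically. What follows is a minimal
sketch, assuming Python's standard unicodedata module; the helper name
canonically_equivalent and the constant names are hypothetical, chosen
only for illustration. It compares each atomic character with the
look-alike sequence under normalization, and contrasts both cases with
U+0623 ARABIC LETTER ALEF WITH HAMZA ABOVE, which does have a canonical
decomposition.

```python
import unicodedata


def nfc(text):
    """Return the NFC form of text."""
    return unicodedata.normalize("NFC", text)


def nfd(text):
    """Return the NFD form of text."""
    return unicodedata.normalize("NFD", text)


def canonically_equivalent(a, b):
    """Hypothetical helper: two strings are canonically equivalent
    exactly when their NFD forms are identical."""
    return nfd(a) == nfd(b)


# Atomic characters and the sequences that may merely look like them.
BEH_WITH_HAMZA = "\u08A1"      # ARABIC LETTER BEH WITH HAMZA ABOVE
BEH_PLUS_HAMZA = "\u0628\u0654"  # ARABIC LETTER BEH + ARABIC HAMZA ABOVE
GHAIN = "\u063A"               # ARABIC LETTER GHAIN
AIN_PLUS_DOT = "\u0639\u0307"    # ARABIC LETTER AIN + COMBINING DOT ABOVE

# Contrast case: an atomic character that DOES decompose canonically.
ALEF_WITH_HAMZA = "\u0623"     # ARABIC LETTER ALEF WITH HAMZA ABOVE
ALEF_PLUS_HAMZA = "\u0627\u0654"  # ARABIC LETTER ALEF + ARABIC HAMZA ABOVE

if __name__ == "__main__":
    pairs = [
        ("beh-with-hamza vs. beh + hamza", BEH_WITH_HAMZA, BEH_PLUS_HAMZA),
        ("ghain vs. ain + dot above", GHAIN, AIN_PLUS_DOT),
        ("alef-with-hamza vs. alef + hamza", ALEF_WITH_HAMZA, ALEF_PLUS_HAMZA),
    ]
    for label, atomic, sequence in pairs:
        print(label, "equivalent:", canonically_equivalent(atomic, sequence))
```

Only the alef pair is reported as equivalent. NFC leaves the sequence
beh + hamza above untouched because U+08A1 has no canonical
decomposition, just as it leaves ain + dot above untouched rather than
producing ghain.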

