Unicode 7.0.0, (combining) Hamza Above, and normalization

Andrew Sullivan ajs at anvilwalrusden.com
Mon Aug 11 00:35:15 CEST 2014

On Sun, Aug 10, 2014 at 01:55:41PM -0700, Asmus Freytag wrote:

> The case of 3 and 4 has been in IDNA from the beginning and affects
> one of the more computer-literate communities (Western Scandinavia).
> It's not, apparently, been something that has led to massive issues,
> otherwise it would be a well known case.

You can't be serious.  U+08A1 is a code point being added _now_.  The
various ways of encoding characters in Western Scandinavia were
already well established and settled by the time IDNA was even an

Indeed, this temporal difference is exactly why some of us, at least,
are uneasy.  A significant chunk of the reason to do IDNA2008 was that
IDNA2003 wasn't Unicode agile, so when you got a new library on your
OS, IDNA was at least nominally broken; certainly you had undefined
behaviour.  We thought this would work because of a misapprehension
about normalization rules.
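
For anyone who wants to see the normalization point concretely, here is
a rough sketch using Python's standard unicodedata module.  It assumes
(this is my illustration, not anything taken from the IDNA documents)
that comparing NFC forms is the test of interest.  The helper names are
made up for the example.  The older ALEF WITH HAMZA ABOVE is equated
with its combining sequence by NFC; the new U+08A1 is not.

```python
import unicodedata

# Precomposed code point added in Unicode 7.0.0 and the sequence that
# looks the same: BEH followed by combining HAMZA ABOVE.
BEH_HAMZA_PRECOMPOSED = "\u08A1"
BEH_HAMZA_SEQUENCE = "\u0628\u0654"

# An older precomposed letter that does have a canonical decomposition:
# ALEF WITH HAMZA ABOVE versus ALEF followed by combining HAMZA ABOVE.
ALEF_HAMZA_PRECOMPOSED = "\u0623"
ALEF_HAMZA_SEQUENCE = "\u0627\u0654"


def nfc_equal(a, b):
    """Hypothetical helper: True if the two strings have the same NFC form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def code_points(s):
    """Hypothetical helper: render a string as a list of U+XXXX labels."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


# Which Unicode version this Python build's tables come from.
print("Unicode data version:", unicodedata.unidata_version)

# The older pair is unified by normalization.
print(nfc_equal(ALEF_HAMZA_PRECOMPOSED, ALEF_HAMZA_SEQUENCE))  # True

# The new code point has no canonical decomposition, so NFC leaves the
# two spellings distinct.
print(nfc_equal(BEH_HAMZA_PRECOMPOSED, BEH_HAMZA_SEQUENCE))    # False
print(code_points(unicodedata.normalize("NFC", BEH_HAMZA_SEQUENCE)))
```

The second comparison comes out False whether or not the library's
tables already know about U+08A1, since in neither case is there a
decomposition to apply.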

Best regards,


Andrew Sullivan
ajs at anvilwalrusden.com
