[Json] Json and U+08A1 and related cases (was: Re: Barry Leiba's Discuss on draft-ietf-json-i-json-05: (with DISCUSS and COMMENT))

Nico Williams nico at cryptonector.com
Wed Jan 21 20:42:54 CET 2015


On Wed, Jan 21, 2015 at 02:29:22PM -0500, John C Klensin wrote:
> --On Wednesday, January 21, 2015 10:22 -0600 Nico Williams
> <nico at cryptonector.com> wrote:
> > I thought that NFC was closed to new precompositions though new
> > precompositions might be added to Unicode.  That is, the NFC
> > form of U+08A1 must be the same as the NFD form of U+08A1,
> > which is to say: U+0628 U+0654.
> > 
> > Is my memory wrong about that?
> 
> That is the understanding that several -- I dare to say most or
> all of  the IDNAbis WG participants -- of us had.  What has
> actually occurred either violates that assumption or introduces
> an extra case, depending on how one looks at the problem.   [...]

Because...

> But, while U+08A1 is abstract-character-identical and even
> plausible-name-identical to U+0628 U+0654, it does _not_
> decompose into the latter.  Instead, NFD(U+08A1) = NFC(U+08A1) =

...this is a desirable property of that particular character, or because 
the UC screwed up?  See below.

> U+08A1.  NFC (U+0628 U+0654) is U+0628 U+0654 as one would
> expect from the stability rules; from that perspective, it is
> the failure of U+08A1 to have a (non-identity) decomposition
> that is the issue.

Is it identical, as rendered as well as semantically, to U+0628 U+0654?

If U+08A1 is identical to U+0628 U+0654 in every way then I think the UC
erred.  If it is not, then U+08A1 strikes me as a new case that IDNA
should treat as though NFC(U+08A1) == U+0628 U+0654 (because what else
could IDNA reasonably do??).  In what ways is U+08A1 not identical to
U+0628 U+0654? (besides, of course, being a different codepoint
sequence)
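
For anyone who wants to check the normalization behavior described in the
quoted text, here is a minimal sketch using Python's standard unicodedata
module. The helper names (codepoints, normal_forms) and the constants are
made up for illustration, and the results depend on the Unicode version
bundled with the interpreter (U+08A1 was added in Unicode 7.0). U+0623,
which does decompose to U+0627 U+0654, is included only as a contrast.

```python
import unicodedata


def codepoints(s):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


def normal_forms(s):
    """Return the NFC and NFD forms of s, rendered as code point lists."""
    return {
        form: codepoints(unicodedata.normalize(form, s))
        for form in ("NFC", "NFD")
    }


# Illustrative constants (names are hypothetical, not from the thread).
PRECOMPOSED = "\u08A1"      # the single code point under discussion
SEQUENCE = "\u0628\u0654"   # BEH followed by combining HAMZA ABOVE
ALEF_HAMZA = "\u0623"       # contrast: has a canonical decomposition

if __name__ == "__main__":
    # The bundled Unicode data version determines what normalize() knows.
    print("Unicode data version:", unicodedata.unidata_version)
    for label, s in (
        ("U+08A1", PRECOMPOSED),
        ("U+0628 U+0654", SEQUENCE),
        ("U+0623", ALEF_HAMZA),
    ):
        print(label, "->", normal_forms(s))
```

If the two spellings of BEH WITH HAMZA ABOVE come back unchanged and
different from each other under both forms, then normalization alone will
not make them compare equal, which is the case being asked about here.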

Nico
-- 


More information about the Idna-update mailing list