[Json] Json and U+08A1 and related cases (was: Re: Barry Leiba's Discuss on draft-ietf-json-i-json-05: (with DISCUSS and COMMENT))
Nico Williams
nico at cryptonector.com
Wed Jan 21 20:42:54 CET 2015
On Wed, Jan 21, 2015 at 02:29:22PM -0500, John C Klensin wrote:
> --On Wednesday, January 21, 2015 10:22 -0600 Nico Williams
> <nico at cryptonector.com> wrote:
> > I thought that NFC was closed to new precompositions though new
> > precompositions might be added to Unicode. That is, the NFC
> > form of U+08A1 must be the same as the NFD form of U+08A1,
> > which is to say: U+0628 U+0654.
> >
> > Is my memory wrong about that?
>
> That is the understanding that several -- I dare to say most or
> all of the IDNAbis WG participants -- of us had. What has
> actually occurred either violates that assumption or introduces
> an extra case, depending on how one looks at the problem. [...]
Because...
> But, while U+08A1 is abstract-character-identical and even
> plausible-name-identical to U+0628 U+0654, it does _not_
> decompose into the latter. Instead, NFD(U+08A1) = NFC(U+08A1) =
...this is a desirable property of that particular character, or because
the UC screwed up? See below.
> U+08A1. NFC (U+0628 U+0654) is U+0628 U+0654 as one would
> expect from the stability rules; from that perspective, it is
> the failure of U+08A1 to have a (non-identity) decomposition
> that is the issue.
Is it identical, both as rendered and semantically, to U+0628 U+0654?
If U+08A1 is identical to U+0628 U+0654 in every way, then I think the UC
erred. If it is not, then U+08A1 strikes me as a new case that IDNA
should treat as though NFC(U+08A1) == U+0628 U+0654 (because what else
could IDNA reasonably do?). In what ways is U+08A1 not identical to
U+0628 U+0654 (besides, of course, being a different codepoint
sequence)?
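
For anyone who wants to check the normalization behavior described in the
quoted text, here is a minimal sketch using Python's standard unicodedata
module. The helper names (codepoints, normalization_report) are made up for
illustration, and the results depend on the Unicode data version bundled
with the interpreter (see unicodedata.unidata_version), so treat it as a
quick probe rather than an authoritative statement about the standard.

```python
import unicodedata

# The two spellings under discussion.
PRECOMPOSED = "\u08A1"      # ARABIC LETTER BEH WITH HAMZA ABOVE
SEQUENCE = "\u0628\u0654"   # ARABIC LETTER BEH + ARABIC HAMZA ABOVE


def codepoints(text):
    """Render a string as space-separated U+XXXX codepoints (hypothetical helper)."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def normalization_report():
    """Return the NFC and NFD forms of both spellings as codepoint strings."""
    report = {}
    for label, text in (("precomposed", PRECOMPOSED), ("sequence", SEQUENCE)):
        report[label] = {
            form: codepoints(unicodedata.normalize(form, text))
            for form in ("NFC", "NFD")
        }
    return report


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for label, forms in normalization_report().items():
        print(label, forms)
    # The two spellings stay distinct under NFC if neither composes nor
    # decomposes into the other.
    same = (unicodedata.normalize("NFC", PRECOMPOSED)
            == unicodedata.normalize("NFC", SEQUENCE))
    print("NFC-equivalent:", same)
```

If the two spellings come back unchanged under both forms, that matches the
description above: neither normalization form maps one onto the other, so a
comparison based only on NFC would treat them as different strings.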
Nico