Q2: What mapping function should be used in a revised IDNA2008 specification?

Wed Apr 1 19:38:19 CEST 2009

At 4:29 PM +0900 4/1/09, Martin J. Dürst wrote:
>My preference would be to use a significantly more restricted set of
>mappings than for IDNA2003. At the very highest level, the IDNA2003
>mappings contained:
>1) Case mappings
>2) NFC mappings (canonical equivalence)
>3) NFKC mappings (compatibility equivalence)
>
>I think the best thing would be to retain 1) and 2), but only a very
>small part of 3). The reason for this is that 1) is used as a parallel
>to the ASCII case equivalence in the ASCII DNS, 2) is an inherent
>representational issue of an encoding that (like Unicode) provides
>composing of accents and the like, but 3) is a hodgepodge collection
>of various kinds of equivalences.

I agree with this preference in principle, but as a practical matter, we are better off saying "use the full NFKC of the version of Unicode currently in use" rather than "use this often-changing case table or function, and use this often-changing canonical table or function, and use this often-changing compatibility table or function". The danger of saying "the Unicode Consortium (TUC) defines NFKC for each version; use it" is approximately the same as the danger of saying "TUC updates the Unicode Standard (TUS) and we think that won't cause us to have to revise this RFC".
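
To make the three layers concrete, here is a rough sketch using Python's standard unicodedata module, which exposes NFC and NFKC for whatever Unicode version the interpreter bundles. The function names (case_map, nfc, nfkc, map_full) are mine, not from any draft, and using lower() for the case step is an assumption; IDNA2003 itself used the Nameprep case-folding tables.

```python
import unicodedata

# Unicode version whose normalization data this interpreter ships with.
UNICODE_VERSION = unicodedata.unidata_version

def case_map(s: str) -> str:
    """Layer 1: case mapping (assumption: simple lowercasing)."""
    return s.lower()

def nfc(s: str) -> str:
    """Layer 2: canonical equivalence only."""
    return unicodedata.normalize("NFC", s)

def nfkc(s: str) -> str:
    """Layer 3: canonical plus compatibility equivalence."""
    return unicodedata.normalize("NFKC", s)

def map_full(label: str) -> str:
    """'Use the full NFKC of the Unicode version in use', plus case mapping.

    Normalize first so compatibility capitals (e.g. U+2102) become plain
    letters that lower() can handle, then normalize again because
    lowercasing can leave a string unnormalized.
    """
    return nfkc(case_map(nfkc(label)))
```

With this sketch, nfc("a\u0308") composes to U+00E4, nfc("\ufb01") leaves the "fi" ligature alone, and nfkc("\ufb01") turns it into the two letters "fi". That last step is the kind of compatibility mapping Martin would like to restrict.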

At 1:41 AM -0400 4/1/09, John C Klensin wrote:
>Under no circumstances should mapping be used as a mechanism for
>undoing those decisions, i.e., mapping should be permitted only
>when the result is a PVALID character.

If you meant "PVALID, CONTEXTJ, or CONTEXTO", I agree.
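
As a sketch of that rule: accept a mapping only if every code point in the result falls into one of the permitted categories. Everything here is illustrative. toy_category is a hypothetical stand-in covering a handful of code points, not the real derived-property table, and map_if_allowed is a name I made up. The contextual rules that CONTEXTJ and CONTEXTO characters must also satisfy are not checked.

```python
from typing import Callable, Optional

# Categories a mapped code point may end up in under the rule above.
ALLOWED = {"PVALID", "CONTEXTJ", "CONTEXTO"}

def toy_category(ch: str) -> str:
    """Hypothetical stand-in for the real derived-property lookup."""
    if ch in "abcdefghijklmnopqrstuvwxyz0123456789-":
        return "PVALID"
    if ch in "\u200c\u200d":      # ZERO WIDTH NON-JOINER / JOINER
        return "CONTEXTJ"
    if ch == "\u00b7":            # MIDDLE DOT
        return "CONTEXTO"
    return "DISALLOWED"

def map_if_allowed(
    label: str,
    mapper: Callable[[str], str],
    category: Callable[[str], str] = toy_category,
) -> Optional[str]:
    """Apply mapper; return the result only if every code point is allowed."""
    mapped = mapper(label)
    if all(category(ch) in ALLOWED for ch in mapped):
        return mapped
    return None  # the mapping would produce a disallowed character
```

For example, map_if_allowed("ABC", str.lower) gives "abc", while map_if_allowed("a_b", str.lower) gives None because "_" is not in the toy table.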