Q2: What mapping function should be used in a revised IDNA2008 specification?

Wed Apr 1 09:38:39 CEST 2009

FWIW, I pretty much agree with Martin on this, but would prefer
that we try to restrict (1) to the simple LowerCase operation,
rather than using CaseFold for mapping.  When the latter
produces results different from LowerCase, some of them seem to
astonish non-expert users.
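
To make the difference concrete, here is a minimal sketch using Python's
built-in str.lower() and str.casefold() as stand-ins for the Unicode
LowerCase and CaseFold operations.  The helper name differs() and the
sample characters are my own choices for illustration, not anything
taken from the IDNA drafts.

```python
def differs(ch: str) -> bool:
    """True when full case folding gives a different result than lowercasing."""
    return ch.lower() != ch.casefold()


# Sharp s, final sigma, and plain ASCII "A" as a control.
for ch in ["\u00DF", "\u03C2", "A"]:
    print(f"U+{ord(ch):04X} lower={ch.lower()!r} "
          f"casefold={ch.casefold()!r} differs={differs(ch)}")
```

Sharp s folds to "ss" and final sigma folds to the ordinary small sigma,
while lowercasing leaves both untouched; those are the sort of results I
have in mind.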

I also believe that compatibility mappings identified with
<font> are clear candidates for exclusion.

    john

--On Wednesday, April 01, 2009 16:29 +0900 "Martin J. Dürst"
<duerst at it.aoyama.ac.jp> wrote:

> My preference would be to use a significantly more restricted
> set of mappings than for IDNA2003. At the very highest level,
> the IDNA2003 mappings contained:
> 1) Case mappings
> 2) NFC mappings (canonical equivalence)
> 3) NFKC mappings (compatibility equivalence)
> 
> I think the best thing would be to retain 1) and 2), but only
> a very small part of 3). The reason for this is that 1) is
> used as a parallel to the ASCII case equivalence in the ASCII
> DNS, 2) is an inherent representational issue of an encoding
> that (like Unicode) provides composing of accents and the
> like, but 3) is a hodgepodge collection of various kinds of
> equivalences.
> 
> Indeed, in the Unicode data file, compatibility equivalences
> are marked with various tags such as <super>, <fraction>,
> <final>, <medial>, <vertical>, <small>, <wide>, <narrow>, and
> so on.
> 
> I haven't done a full analysis, but I think we need to keep
> <wide> and <narrow> because of how East Asian IMEs work,
> but we should definitely get rid of <super>, <fraction> (which
> can produce slashes), and so on, because they really, really
> don't make sense in a domain name context.
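
As a rough illustration of the tag-based selection discussed above, the
sketch below reads the compatibility tag from Python's unicodedata module
and applies compatibility mappings only for <wide> and <narrow>, followed
by lowercasing and NFC.  The names compat_tag(), restricted_map() and
KEEP_TAGS are hypothetical, and the choice and order of steps are
assumptions for illustration only, not a proposed IDNA2008 algorithm.

```python
import unicodedata

# Assumption: only these compatibility tags are retained.
KEEP_TAGS = {"<wide>", "<narrow>"}


def compat_tag(ch: str):
    """Return the compatibility tag (e.g. '<wide>') of a character, or None
    if it has no decomposition or only a canonical (untagged) one."""
    decomp = unicodedata.decomposition(ch)
    if decomp.startswith("<"):
        return decomp.split(" ", 1)[0]
    return None


def restricted_map(label: str) -> str:
    """Hypothetical mapping: compatibility-map only characters whose tag is
    in KEEP_TAGS, then lowercase, then normalize to NFC."""
    out = []
    for ch in label:
        if compat_tag(ch) in KEEP_TAGS:
            ch = unicodedata.normalize("NFKC", ch)
        out.append(ch)
    return unicodedata.normalize("NFC", "".join(out).lower())


# Fullwidth A, superscript two, one half, halfwidth katakana KA,
# double-struck C, and precomposed e-acute (canonical, so no tag).
for ch in ["\uFF21", "\u00B2", "\u00BD", "\uFF76", "\u2102", "\u00E9"]:
    print(f"U+{ord(ch):04X} tag={compat_tag(ch)} "
          f"NFKC={unicodedata.normalize('NFKC', ch)!r} "
          f"restricted={restricted_map(ch)!r}")
```

Under these assumptions, fullwidth Latin and halfwidth katakana are
mapped to their ordinary forms, while characters tagged <super>,
<fraction> or <font> pass through unmapped; a real specification would
then have to decide whether to reject them outright.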