Q2: What mapping function should be used in a revised IDNA2008 specification?

"Martin J. Dürst" duerst at it.aoyama.ac.jp
Wed Apr 1 09:29:56 CEST 2009


My preference would be to use a significantly more restricted set of 
mappings than for IDNA2003. At the very highest level, the IDNA2003
mappings contained:
1) Case mappings
2) NFC mappings (canonical equivalence)
3) NFKC mappings (compatibility equivalence)
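
As a rough illustration of these three layers (a sketch using Python's standard unicodedata module, not the exact IDNA2003 Nameprep tables; str.casefold() is assumed here as a stand-in for the Nameprep case mapping):

```python
import unicodedata

# 1) Case mapping: upper and lower case compare equal, as in the ASCII DNS.
case_mapped = "Example.COM".casefold()

# 2) NFC: "e" followed by U+0301 COMBINING ACUTE ACCENT composes to U+00E9.
nfc_mapped = unicodedata.normalize("NFC", "e\u0301")

# 3) NFKC: compatibility characters are replaced by their "plain" forms.
wide_mapped = unicodedata.normalize("NFKC", "\uff41")      # FULLWIDTH LATIN SMALL LETTER A
super_mapped = unicodedata.normalize("NFKC", "\u00b2")     # SUPERSCRIPT TWO
fraction_mapped = unicodedata.normalize("NFKC", "\u00bd")  # VULGAR FRACTION ONE HALF

print(case_mapped, nfc_mapped, wide_mapped, super_mapped, fraction_mapped)
```

Note that the last case yields "1", U+2044 FRACTION SLASH, "2": a slash-like character appears where the input had none.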

I think the best thing would be to retain 1) and 2), but only a very
small part of 3). The reasons are as follows: 1) parallels the ASCII
case equivalence in the ASCII DNS; 2) is an inherent representational
issue of any encoding that (like Unicode) allows accents and the like
to be composed; but 3) is a hodgepodge collection of various kinds of
equivalences.

Indeed, in the Unicode data file, compatibility equivalences are marked
with various tags such as <super>, <fraction>, <final>,
<medial>, <vertical>, <small>, <wide>, <narrow>, and so on.
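
These tags can be inspected directly. The following is a minimal sketch, assuming Python's unicodedata module (which exposes the decomposition field of the Unicode data file); decomposition_tag is a hypothetical helper name, and the label "canonical" for untagged decompositions is a convention of this sketch only:

```python
import unicodedata

def decomposition_tag(ch):
    """Return the compatibility tag of ch, "canonical" for an untagged
    (canonical) decomposition, or None if ch has no decomposition."""
    fields = unicodedata.decomposition(ch).split()
    if not fields:
        return None
    # Compatibility decompositions start with a tag in angle brackets.
    return fields[0] if fields[0].startswith("<") else "canonical"

samples = {
    "\u00b2": "SUPERSCRIPT TWO",
    "\u00bd": "VULGAR FRACTION ONE HALF",
    "\uff21": "FULLWIDTH LATIN CAPITAL LETTER A",
    "\uff76": "HALFWIDTH KATAKANA LETTER KA",
    "\u00e9": "LATIN SMALL LETTER E WITH ACUTE",
    "a": "LATIN SMALL LETTER A",
}

for ch, name in samples.items():
    print(f"U+{ord(ch):04X} {name}: {decomposition_tag(ch)}")
```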

I haven't done a full analysis, but I think we need to keep
<wide> and <narrow> because of how East Asian IMEs work.
However, we should definitely get rid of <super>, <fraction> (which
can produce slashes), and so on, because they really, really
don't make sense in a domain name context.
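
A restricted mapping along these lines could be sketched as follows. This is only an illustration under stated assumptions, not a proposed specification: restricted_map, map_char and KEPT_TAGS are hypothetical names, str.casefold() stands in for the case mapping, and the set of retained tags is just the two mentioned above.

```python
import unicodedata

# Assumption: only these compatibility tags are still mapped.
KEPT_TAGS = {"<wide>", "<narrow>"}

def map_char(ch):
    """Replace ch by its compatibility mapping only if its tag is kept."""
    fields = unicodedata.decomposition(ch).split()
    if fields and fields[0] in KEPT_TAGS:
        return "".join(chr(int(cp, 16)) for cp in fields[1:])
    return ch

def restricted_map(label):
    """Width mapping, then case mapping, then NFC."""
    unwidened = "".join(map_char(ch) for ch in label)
    return unicodedata.normalize("NFC", unwidened.casefold())

print(restricted_map("\uff25\uff58ample"))  # fullwidth "E" and "x" are mapped
print(restricted_map("x\u00b2"))            # SUPERSCRIPT TWO is left untouched
```

Characters that are left unmapped, such as the superscript here, would then have to be either rejected or allowed by other rules; the sketch does not address that question.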

Regards,     Martin.

On 2009/04/01 1:07, Vint Cerf wrote:
> What characters should be mapped into what other characters in a
> revised IDNA2008 specification?
>
> Can we describe succinctly and precisely what these mappings are? How?
> What should they be?
>
>
>
> Vint Cerf
> Google
> 1818 Library Street, Suite 400
> Reston, VA 20190
> 202-370-5637
> vint at google.com
>
>
>
>
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update
>

-- 
#-# Martin J.Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst at it.aoyama.ac.jp

