Q2: What mapping function should be used in a revised IDNA2008 specification?

Erik van der Poel erikv at google.com
Fri Apr 3 18:29:35 CEST 2009

Hello Martin,

On Thu, Apr 2, 2009 at 11:40 PM, "Martin J. Dürst"
<duerst at it.aoyama.ac.jp> wrote:
> On 2009/04/03 2:50, Erik van der Poel wrote:
>> <initial>       An initial presentation form (Arabic).
>> <medial>        A medial presentation form (Arabic).
>> <final>         A final presentation form (Arabic).
>> <isolated>      An isolated presentation form (Arabic).
> Out, I'd say, but it would be better to get input on this
> from the Arabic IDN experts.

I agree. In particular, are there input methods that directly lead to these?

>> <vertical>      A vertical layout presentation form.
> Unclear. Out unless they happen to be introduced by IMEs
> (my guess is that these are mostly used for glyph identifiers
> in fonts, but that's just a guess)

It would be good to get confirmation, but my guess is that even if
there is a special input method mode for these, people are less likely
to accidentally type them.

>> <compat>        Otherwise unspecified compatibility character.
> That's probably the category that we will have to look at most closely.

Yes, many of these are likely to be "out", but some of them might be "in".

I could take a look at domain names in HTML on the Web, to see which
characters that NFKC would map are actually used, and how often. But we
should be careful not to draw conclusions solely from Web data.
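
As a rough illustration of how such a survey might group characters by
the compatibility tags listed above, here is a minimal sketch in Python.
It assumes the standard unicodedata module, whose results depend on the
Unicode version shipped with the interpreter. The helper names
compat_tag and count_compat_tags are hypothetical, and the input is
assumed to be a list of already-decoded Unicode labels, not Punycode.

```python
import unicodedata
from collections import Counter


def compat_tag(ch):
    """Return the compatibility tag of ch (e.g. '<initial>'), or None.

    unicodedata.decomposition() returns a string such as
    '<initial> 0628' for compatibility decompositions, a plain list of
    code points for canonical ones, and '' when there is none.
    """
    decomp = unicodedata.decomposition(ch)
    if decomp.startswith("<"):
        return decomp.split(">", 1)[0] + ">"
    return None


def count_compat_tags(labels):
    """Count how often each compatibility tag occurs across labels."""
    counts = Counter()
    for label in labels:
        for ch in label:
            tag = compat_tag(ch)
            if tag is not None:
                counts[tag] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample input: an Arabic initial form, a vertical
    # presentation form, and a plain ASCII label.
    sample = ["\uFE91", "\uFE35", "example"]
    for tag, n in sorted(count_compat_tags(sample).items()):
        print(tag, n)
```

Counts like these would only show what appears in the sampled data, so
they would have to be weighed against input from script experts.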


