Lookup & NFC

Martin Duerst duerst at it.aoyama.ac.jp
Fri Mar 28 06:22:01 CET 2008


At 03:10 08/03/28, John C Klensin wrote:

>The more important answer is that the intent of the spec is "if
>you need this mapping, it is your job to apply it before you
>invoke IDNA".  Taking NFC as an example, let's assume we have
>two operating systems,
>
>       * One of them gets strings into NFC form as soon as they
>       are typed and verifies that (and corrects them if
>       necessary) any time they are loaded or otherwise
>       examined.
>       
>       * The other lets users type strings and carries them
>       around in whatever form they are typed, presumably
>       unnormalized.

This is in essence correct, but it implies that things mainly
depend on how the user types them. This is very much NOT the
case. Whether the user types some accents with modifier keys
(in some cases called dead keys), some shift combination, or
a predefined key for that accented character is independent
of whether these characters enter the system as precomposed
or decomposed. Microsoft and most Unix/Linux systems use
precomposed characters, so that's what an application gets
from the keyboard driver and related machinery. The Mac
uses decomposed, so there, that's what you get.
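
To make the precomposed/decomposed distinction concrete, here is a
minimal Python sketch. The choice of "e with acute" and the helper
name code_points are just assumptions for illustration; the sketch
says nothing about what any particular keyboard driver delivers.

```python
import unicodedata

# The same visible character in two encodings (illustrative choice).
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT


def code_points(s):
    """Return the code points of s in U+XXXX notation (hypothetical helper)."""
    return [f"U+{ord(c):04X}" for c in s]


print(code_points(precomposed))
print(code_points(decomposed))

# The two strings look the same but do not compare equal as they are.
print(precomposed == decomposed)

# NFC composes and NFD decomposes, so each form maps onto the other.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
print(unicodedata.normalize("NFD", precomposed) == decomposed)
```

An application on a system that delivers the decomposed form would
see the second string, and would have to apply NFC itself before
handing the label to IDNA.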

Also, as far as I know (the Mac may be an exception), the
data is not usually normalized or checked for normalization.
In general, that is not necessary, because the keyboard driver
already takes care of this. But if the user e.g. enters
some non-normalized characters from a character picker or so,
then these enter the datastream as they are, unchecked.
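
An application that cannot rely on its input path could check and
repair the string itself. The following is only a sketch: is_nfc and
ensure_nfc are hypothetical helper names, and the check is done by
normalizing and comparing, which is simple but not the fastest
possible test (Python 3.8 and later also offer
unicodedata.is_normalized).

```python
import unicodedata


def is_nfc(s: str) -> bool:
    """True if s is already in NFC (checked by normalizing and comparing)."""
    return unicodedata.normalize("NFC", s) == s


def ensure_nfc(s: str) -> str:
    """Return s unchanged if it is in NFC, otherwise its NFC form."""
    return s if is_nfc(s) else unicodedata.normalize("NFC", s)


# Input as it might arrive from a character picker: "u" + COMBINING DIAERESIS.
picked = "Du\u0308rst"
print(is_nfc(picked))
print(ensure_nfc(picked) == "D\u00fcrst")
```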

Regards,    Martin.


>For the first, it would clearly be silly for the internals of an
>IDNA implementation to spend energy converting to NFC (although
>as Mark has pointed out, I think on this list, the check that
>the string is in NFC form is sufficiently simple and quick that
>one might make it in the name of robustness).  For the second,
>the spec requires that the application get the string into NFC
>form before looking it up, but one would assume that would be
>fairly natural.  As you implicitly point out, lookups of any
>form would generally be expected to fail unless normalized
>strings were compared to normalized strings.
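
The last point in the quoted text, that lookups fail unless
normalized strings are compared to normalized strings, can be shown
with a toy table. This is a sketch under loud assumptions: the dict
named registry, the label and the address are all made up, and no
real IDNA processing (Punycode, DNS) is involved.

```python
import unicodedata


def nfc(s: str) -> str:
    """Shorthand for NFC normalization."""
    return unicodedata.normalize("NFC", s)


# Hypothetical lookup table whose keys are stored in NFC.
registry = {nfc("b\u00fccher.example"): "192.0.2.1"}

# The same label as a decomposed sequence: "u" + COMBINING DIAERESIS.
typed = "bu\u0308cher.example"

# Looking up the raw string misses; normalizing first finds the entry.
print(registry.get(typed))
print(registry.get(nfc(typed)))
```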


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst at it.aoyama.ac.jp     


