Prohibiting mapping of PVALID characters

"Martin J. Dürst" duerst at it.aoyama.ac.jp
Thu Dec 10 08:02:35 CET 2009


Hello Andrew,

I think the confusion you talk about below comes from the fact that
PVALID is defined on single characters, but NFC (as a form) is defined
on sequences of characters. What we require for U-labels (among other
things) is that they contain only PVALID characters AND that they are
in NFC. The only effect that applying NFC (as a conversion) can have on
a sequence of PVALID characters is that it may convert a sequence that
contains only PVALID characters but is not in NFC into a sequence that
contains only PVALID characters and is also in NFC. The former isn't a
U-label, the latter is.
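
To make the form/conversion distinction concrete, here is a minimal
sketch using Python's standard unicodedata module. The helper name
is_nfc and the example string are assumptions chosen for illustration;
as far as I know, all three code points involved (U+0065, U+0301,
U+00E9) are PVALID, but the code itself checks only normalization, not
PVALID status.

```python
import unicodedata

def is_nfc(s: str) -> bool:
    """Return True if s is already in Normalization Form C."""
    return unicodedata.normalize("NFC", s) == s

# "e" followed by U+0301 COMBINING ACUTE ACCENT: not in NFC as a sequence.
decomposed = "e\u0301"

# Applying NFC (as a conversion) composes the pair into the single
# character U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

print(is_nfc(decomposed))                      # False
print(is_nfc(composed))                        # True
print([f"U+{ord(c):04X}" for c in composed])   # ['U+00E9']
```

The first sequence would not qualify as a U-label, while the second
would satisfy the NFC requirement.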

[For everybody: I think it's important to understand that (as far as I
know; Ken or Mark, please correct me if I'm wrong) none of the two (to
four) characters we are currently wrestling with is in any way involved
in NFC.]

Here are some additional things that we might want to check just to be 
sure we understand things:

a) There may be sequences of PVALID and non-PVALID characters that,
when NFC is applied, turn into sequences of PVALID characters only. If
there are such cases, I think they should be allowed, and our text
shouldn't forbid such a transformation.

b) There may be sequences of PVALID characters only that, when NFC is
applied, yield a result containing some non-PVALID characters. If
that's the case, those characters might further be mapped to something;
would we be okay with that?
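
Checks a) and b) could be sketched roughly as follows. This is only an
illustration: classify_nfc_effect and toy_is_pvalid are hypothetical
names, and the toy predicate is NOT the real PVALID table (a real check
would have to look each character up in the actual IDNA tables). U+0340
is used in the example only because NFC replaces it with U+0300.

```python
import unicodedata
from typing import Callable

def classify_nfc_effect(s: str, is_pvalid: Callable[[str], bool]) -> str:
    """Report how applying NFC changes the PVALID-only status of s.

    is_pvalid is a caller-supplied predicate on single characters; it
    stands in for a lookup in the real tables.
    """
    before = all(is_pvalid(c) for c in s)
    after = all(is_pvalid(c) for c in unicodedata.normalize("NFC", s))
    if not before and after:
        return "case a"   # not PVALID-only before NFC, PVALID-only after
    if before and not after:
        return "case b"   # PVALID-only before NFC, not PVALID-only after
    return "unchanged"

def toy_is_pvalid(c: str) -> bool:
    """Toy stand-in for demonstration only: lowercase letters and
    combining marks, minus U+0340."""
    return c != "\u0340" and (c.islower() or unicodedata.combining(c) != 0)

# "a" + U+0340 normalizes to U+00E0 under NFC.
print(classify_nfc_effect("a\u0340", toy_is_pvalid))   # case a
print(classify_nfc_effect("abc", toy_is_pvalid))       # unchanged
```

Running such a function over candidate sequences with the real
predicate would tell us whether case b) can occur at all.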

Regards,   Martin.

On 2009/12/10 13:05, Andrew Sullivan wrote:
> On Wed, Dec 09, 2009 at 05:10:42PM -0800, Kenneth Whistler wrote:
>> I repeat: strings do not turn into NFC form by magic. They
>> are mapped by the normalization algorithm. And that mapping
>> involves, necessarily, PVALID -->  PVALID characters in
>> this case.
>
> Maybe there's something wrong with the definition of PVALID?  If the
> initial PVALID-but-not-NFC character has to be mapped, because the
> strings must all be "in NFC form", then it's hard to understand how
> such a character is in fact valid under the protocol.
>
> (To be clear, I'm not complaining about what you're saying.  You've
> made much more pointed the sort of thing that was nagging at me
> earlier today.)
>
> A
>

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst at it.aoyama.ac.jp

