U-labels, NFC, and symmetry
stpeter at stpeter.im
Fri Apr 15 18:36:59 CEST 2011
On 4/15/11 10:15 AM, John C Klensin wrote:
> --On Friday, April 15, 2011 09:52 -0600 Peter Saint-Andre
> <stpeter at stpeter.im> wrote:
>>> If what you're saying is that you want a definition of
>>> D-compatible U-label, I am not sure whether that is practical.
>> I (well, the XMPP folks) *might* want a D-compatible
>> domaineything. I think we've already determined that such a
>> thing would not be a U-label.
> The conformance test for a D-compatible domaineything would be:
> (1) Verify that it is in NFD form (otherwise, it isn't
> (2) Convert it to NFC
> (3) Apply the operations of RFC 5891 and the rules specified in
> RFC 5892 (or tables derived from them) to verify that the result
> of (2) is a U-label.
> There isn't any practical way to apply 5891/5892 to an
> NFD-conformant string without first converting it to NFC. The
> rules and algorithms just aren't constructed that way.
> And, of course, converting an A-label to a U-label yields an
> NFC-conformant string that you would then have to convert to an
> NFD string for your purposes.
> That is not impossible. Given your earlier analysis, it would
> be a small marginal cost for a relatively rare case. Keeping
> things straight (remembering what is in which form) either requires
> that implementers be very careful --much more careful than if
> everything were in the same form-- or a lot of testing to be
> sure they had gotten it right, but possibly XMPP implementers
> are significantly more careful on average than we usually see
> with applications.
> But, modulo a potential issue with characters newly added to
> Unicode, I still don't see the case for NFD: it certainly
> doesn't make string comparisons any easier.
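For reference, the three-step conformance test quoted above could be sketched roughly as follows. This is only an illustration: the name is_d_compatible and the is_u_label parameter are hypothetical, and the actual RFC 5891/5892 check in step (3) is left to whatever IDNA2008 validator the caller supplies, since Python's standard library does not provide one.

```python
import unicodedata
from typing import Callable


def is_d_compatible(label: str, is_u_label: Callable[[str], bool]) -> bool:
    """Sketch of the quoted three-step test for a single label.

    `is_u_label` is an assumed, caller-supplied predicate that applies
    the RFC 5891 operations and RFC 5892 rules to an NFC string.
    """
    # (1) The input must already be in NFD form.
    if unicodedata.normalize("NFD", label) != label:
        return False
    # (2) Convert it to NFC, the only form the IDNA2008 rules are written for.
    nfc_label = unicodedata.normalize("NFC", label)
    # (3) Delegate the U-label check to the supplied validator.
    return is_u_label(nfc_label)
```

Note that the validator only ever sees the NFC string, which matches the point that 5891/5892 cannot practically be applied to NFD input directly.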
Thanks for the input. I shan't post further in this thread until I've
had a chance to think about things some more and check with some of the
implementers in the XMPP community.
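The reverse direction mentioned in the quoted text (A-label to U-label, then on to NFD) might look something like the sketch below. It assumes Python's built-in "punycode" codec for the decoding step; the function name a_label_to_nfd is hypothetical, and the sketch does not verify that the decoded string is a valid U-label, so it is not a substitute for a real RFC 5891 implementation.

```python
import unicodedata

ACE_PREFIX = "xn--"


def a_label_to_nfd(a_label: str) -> str:
    """Decode an A-label and re-normalize the result to NFD.

    Assumes the input is a well-formed A-label; no RFC 5891/5892
    validation is performed on the decoded string.
    """
    lowered = a_label.lower()
    if not lowered.startswith(ACE_PREFIX):
        raise ValueError("not an A-label: missing 'xn--' prefix")
    # Punycode-decode the part after the ACE prefix. For a valid
    # A-label the result is the NFC-conformant U-label.
    u_form = lowered[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    # Extra step a D-compatible consumer would need: NFC -> NFD.
    return unicodedata.normalize("NFD", u_form)
```

An implementation holding both forms would need to track which strings have been through this last step, which is the bookkeeping burden described in the quoted text.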