Prohibiting mapping of PVALID characters
jean-michel bernier de portzamparc
jmabdp at gmail.com
Thu Dec 10 03:33:24 CET 2009
Fascinating is the way our Unicode fellow Members are going to "oppose the
charter, disregard a consensus, reverse an 80% feedback, force the Chair to
"change his mind" and most probably his consensus evaluation, impose their
interests on their commercial competition, and write themselves a paper
legitimacy against the demand of their own users, whom they are most probably
certain they can influence in the same manner".
Also fascinating to realize that the whole effort of this small group of
"60 Hz" people is void. The 60hzians' strategy already failed on TATWEEL. This
was due to the reasonable positions of Siavash Shahshahani and Roozbeh
Pournader. Users' interest, number, and technical influence are moving to
the diversity of the workon at idna2010.org subscribers.
NB. Microsoft's position regarding the French language has now been archived
as an important pre-experimentation input.
2009/12/10 Kenneth Whistler <kenw at sybase.com>
> > >Protocol, 5.2:
> > >
> > >5.2. Conversion to Unicode
> > >
> > >The string is converted from the local character set into
> > >Unicode, if it is not already in Unicode. Depending on local
> > >needs, this conversion may involve mapping some characters
> > >into other characters as well as coding conversions...
> > >The results MUST be a Unicode string in NFC form.
> > >
> > >
> > >Strings don't magically get to be "in NFC form", without
> > >being mapped (via normalization algorithm) from whatever form
> > >they started out as, *into* NFC form.
> > In this case, yes they do. That "MUST" is probably wrong; I believe
> > that the statement is meant to say "The results will be a
> > Unicode string in NFC form".
> > John, et al.: is my understanding correct here?
> I repeat: strings do not turn into NFC form by magic. They
> are mapped by the normalization algorithm. And that mapping
> involves, necessarily, PVALID --> PVALID characters in
> this case.
> And "the results" will not be a "Unicode string in NFC form"
> unless it is mapped to that. (Except of course, by accident,
> if it happens to start out as NFC in the first place.)
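To illustrate the point quoted above, here is a minimal sketch in Python, assuming the standard-library unicodedata module as the normalization implementation. It shows that converting a decomposed string to NFC changes its code points, while a string that already is in NFC passes through unchanged. The sample strings are illustrative only and do not come from the thread.

```python
import unicodedata

# 'e' followed by COMBINING ACUTE ACCENT (U+0301): a decomposed sequence.
decomposed = "e\u0301"

# Normalizing to NFC maps the two code points onto the single
# precomposed code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

# A string that starts out in NFC is left untouched by the same call.
already_nfc = "\u00e9"
unchanged = unicodedata.normalize("NFC", already_nfc)

print([hex(ord(c)) for c in decomposed])
print([hex(ord(c)) for c in composed])
print(unchanged == already_nfc)
```

The first string only becomes NFC because the normalization algorithm was run on it; the second was NFC "by accident", to use the wording above.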
> I think what you may be trying to get at is a requirement that
> the only valid *input* to the label processing is a string
> which is already a Unicode string in NFC form -- and that
> how it got to be that way and indeed whether it started out
> as a string in SJIS or 8859-7 or whatever, is beyond the
> protocol's scope of caring.
> But if that is the case, then why are we talking about trying
> to add into Section 5.2 a prohibition (a MUST NOT) against
> mapping PVALID characters? Because manifestly, for any
> actual implementation to meet the requirement of having
> Unicode strings in NFC format as valid input for the label
> processing, it MUST map strings (including PVALID
> characters) using the Unicode normalization algorithm.
> I understand that maybe you want to say that that isn't
> a requirement *in* the protocol -- it is simply a requirement
> on the well-formedness of valid input to be handled as
> labels. And maybe I don't understand how you distribute
> the MUSTs, SHOULDs and wills around the document to accomplish
> that.
> But my essential point here is that you cannot have your cake
> and eat it too -- trying to prohibit mapping of PVALID
> characters in Section 5.2 at the same time that you
> are requiring mapping of PVALID characters in Section 5.2 --
> however the exact protocol wordsmanship of that gets
> worked out.
> Maybe you can, for example, prohibit *case* mapping of PVALID
> characters in Section 5.2, while requiring canonical
> normalization mapping of PVALID characters to NFC. That
> at least would be a coherent position. But you cannot
> just prohibit mapping and require mapping of the same
> class of characters, without being more discriminating
> in what you mean.
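The two readings discussed in the quoted message can be sketched in Python, again assuming the standard-library unicodedata module. The function names below are hypothetical and appear in no IDNA draft. One function treats NFC purely as a precondition on input (check and reject, never map); the other applies canonical normalization while deliberately performing no case mapping.

```python
import unicodedata

def is_nfc(label: str) -> bool:
    """Check only: True if the label is already in NFC form."""
    return unicodedata.normalize("NFC", label) == label

def accept_only_nfc(label: str) -> str:
    """Treat NFC as a precondition on input: reject, never map."""
    if not is_nfc(label):
        raise ValueError("label is not in NFC form")
    return label

def normalize_without_case_mapping(label: str) -> str:
    """Apply canonical normalization to NFC and nothing else.

    No lower() or casefold() call is made, so case is preserved.
    """
    return unicodedata.normalize("NFC", label)

# Illustrative input: uppercase 'E' + COMBINING ACUTE ACCENT, then "cole".
mixed = "E\u0301cole"
result = normalize_without_case_mapping(mixed)

print(is_nfc(mixed))             # the decomposed input is not NFC
print(is_nfc(result))            # the mapped output is NFC
print(result == result.lower())  # case was not mapped
```

Under the first reading, accept_only_nfc(mixed) raises an error and the caller is responsible for normalizing beforehand; under the second, the mapping happens inside the processing step, which is the situation the message argues cannot coexist with a blanket prohibition on mapping.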