sharp s (Eszett)

Erik van der Poel erikv at
Fri Mar 7 21:52:40 CET 2008

Section 5 of the protocol-04 draft starts with a discussion of
non-Unicode encodings, then Unicode strings and so on, until section
5.5, where it says that the U-label is converted to its A-label form.

Where in the IDNA200X documents does it say what a resolver is
supposed to do with a label that is *already* in A-label form at the
beginning? If it is not in section 5, should it be?
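
To make the question concrete, here is a rough Python sketch of one thing a resolver *could* do when handed a label that already starts with the ACE prefix: decode it, re-encode it, and accept it only if the round trip reproduces the input. This is an assumption for illustration, not something taken from the IDNA200X drafts. The function name `classify_label` and its return strings are made up, and Python's built-in `punycode` codec stands in for the draft's conversion step.

```python
ACE_PREFIX = "xn--"


def classify_label(label: str) -> str:
    """Hypothetical helper: say how a resolver might treat `label`.

    Returns one of "a-label", "invalid-ace", "ascii" or "non-ascii".
    """
    lowered = label.lower()
    if lowered.startswith(ACE_PREFIX):
        try:
            # Decode the part after the prefix with the stdlib punycode codec.
            decoded = lowered[len(ACE_PREFIX):].encode("ascii").decode("punycode")
            # Re-encode and compare with the input (round-trip check).
            reencoded = ACE_PREFIX + decoded.encode("punycode").decode("ascii")
        except (UnicodeError, ValueError, IndexError):
            return "invalid-ace"
        # A label that decodes to pure ASCII never needed the prefix.
        if decoded.isascii() or reencoded != lowered:
            return "invalid-ace"
        return "a-label"
    if lowered.isascii():
        return "ascii"
    # Non-ASCII input: a U-label candidate that still has to pass the
    # protocol's validity tests before conversion (not modelled here).
    return "non-ascii"


if __name__ == "__main__":
    for sample in ["xn--bcher-kva", "b\u00fccher", "example", "xn--abc-"]:
        print(sample, "->", classify_label(sample))
```

The sketch stops at the round trip. Whether the decoded form must also pass the same tests as a label that started out as Unicode is exactly the open question.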


On Fri, Mar 7, 2008 at 12:22 PM, John C Klensin <klensin at> wrote:
>  --On Friday, 07 March, 2008 11:51 -0800 Michel Suignard
>  <michelsu at> wrote:
>  > Again, when I read
>  >
>  > ection-5.4 I see a set of rules that will require application
>  > to update when the repertoire grows. I am not exactly excited
>  > about it either, but it just means that browsers need some
>  > sort of self-update mechanism that most of them have anyway
>  > (including IE7). It is fairly clear that every browser (afaik)
>  > will require a serious patch to move from IDNA2003 to IDNA200x
>  > anyway (the bidi behavior change being another breaking
>  > change), so this does not create a specific issue on its own.
>  Yes.
>  The key reason for that update requirement is precisely the one
>  that Erik identified: one cannot know for sure what properties a
>  codepoint that might be assigned to a previously-unknown
>  location will have with regard to casefolding, compatibility and
>  canonical compositions and decompositions, and so on.  While we
>  all hope that there are enough reserved positions to head off
>  problems, one cannot even know for certain that a newly-assigned
>  codepoint will not have Right-to-Left properties that would
>  invoke the special bidi treatment.
>  On the other hand, those updates are not needed all at once.  If
>  the main change between Unicode 7.1 and 7.2 is the introduction
>  of several scripts for Martian languages, and you don't do
>  business on Mars or with multiplanetary Martian firms, there
>  might be few people who care how rapidly you incorporate those
>  changes.  The only firm requirement is that, at any given time
>  and with any given version of Unicode, your Unicode tables and
>  your IDNA ones reflect consistent versions.
>  Finally, assuming Erik's tests are correct (and I understand
>  them), MSIE7 is already being cautious about looking up
>  unassigned code points, putting it ahead of the curve on
>  IDNA200X rather than requiring retrofitting in that area.
>      john
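
As a rough illustration of the "cautious about unassigned code points" behaviour described above, the following Python sketch refuses a label if any of its code points is unassigned in the Unicode tables the application ships with. It is only a sketch under stated assumptions: the function names are hypothetical, and the general category `Cn` from Python's `unicodedata` module stands in for the IDNA200X tables, which are not reproduced here.

```python
import unicodedata
from typing import List


def unassigned_code_points(label: str) -> List[str]:
    """Hypothetical helper: list code points with general category Cn.

    Cn means "unassigned" in the Unicode version bundled with this
    Python build (see unicodedata.unidata_version).
    """
    return [f"U+{ord(ch):04X}" for ch in label if unicodedata.category(ch) == "Cn"]


def safe_to_look_up(label: str) -> bool:
    """Refuse lookup when the local tables do not know a code point."""
    return not unassigned_code_points(label)


if __name__ == "__main__":
    print("Unicode tables in use:", unicodedata.unidata_version)
    for sample in ["stra\u00dfe", "a\u0378"]:
        print(repr(sample), safe_to_look_up(sample), unassigned_code_points(sample))
```

Because the check and the version string come from the same table, the two cannot drift apart, which is the consistency requirement stated above in miniature.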
