sharp s (Eszett)
Erik van der Poel
erikv at google.com
Fri Mar 7 21:52:40 CET 2008
Section 5 of the protocol-04 draft starts with a discussion of
non-Unicode encodings, then Unicode strings and so on, until section
5.5, where it says that the U-label is converted to its A-label form.
Where in the IDNA200X documents does it say what a resolver is
supposed to do with a label that is *already* in A-label form at the
beginning? If it is not in section 5, should it be?
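To make the question concrete, here is a minimal Python sketch of the case I mean: a lookup routine that first checks whether the incoming label already looks like an A-label. The function names are hypothetical and are not taken from any IDNA200X draft. The check uses only the "xn--" ACE prefix and Python's built-in "punycode" codec, and it deliberately does not decide what the resolver should do next, since that is exactly what the draft leaves open.

```python
ACE_PREFIX = "xn--"  # ACE prefix used by IDNA A-labels


def classify_label(label: str) -> str:
    """Roughly classify a label at the start of lookup (hypothetical helper)."""
    if not label.isascii():
        return "non-ascii"            # candidate U-label, goes through the section 5 steps
    if label.lower().startswith(ACE_PREFIX):
        return "a-label-candidate"    # already in ACE form; treatment is the open question
    return "ascii"                    # ordinary ASCII label


def decode_candidate(label: str):
    """Decode the Punycode payload of an xn-- label; return None if it is malformed."""
    payload = label[len(ACE_PREFIX):]
    try:
        return payload.encode("ascii").decode("punycode")
    except UnicodeError:
        return None


if __name__ == "__main__":
    for name in ("example", "xn--bcher-kva", "b\u00fccher"):
        print(name, "->", classify_label(name))
```

One possible answer, and this is an assumption on my part rather than something the draft states, is that the resolver decodes such a label and then runs the result through the same checks it applies to a U-label.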
On Fri, Mar 7, 2008 at 12:22 PM, John C Klensin <klensin at jck.com> wrote:
> --On Friday, 07 March, 2008 11:51 -0800 Michel Suignard
> <michelsu at windows.microsoft.com> wrote:
> > Again, when I read
> > http://tools.ietf.org/html/draft-klensin-idnabis-protocol-04#section-5.4
> > I see a set of rules that will require application
> > to update when the repertoire grows. I am not exactly excited
> > about it either, but it just means that browsers need some
> > sort of self-update mechanism that most of them have anyway
> > (including IE7). It is fairly clear that every browser (afaik)
> > will require a serious patch to move from IDNA2003 to IDNA200x
> > anyway (the bidi behavior change being another breaking
> > change), so this does not create a specific issue on its own.
> The key reason for that update requirement is precisely the one
> that Erik identified: one cannot know for sure what properties a
> codepoint that might be assigned to a previously-unknown
> location will have with regard to casefolding, compatibility and
> canonical compositions and decompositions, and so on. While we
> all hope that there are enough reserved positions to head off
> problems, one cannot even know for certain that a newly-assigned
> codepoint will not have Right-to-Left properties that would
> invoke the special bidi treatment.
> On the other hand, those updates are not needed all at once. If
> the main change between Unicode 7.1 and 7.2 is the introduction
> of several scripts for Martian languages, and you don't do
> business on Mars or with multiplanetary Martian firms, there
> might be few people who care how rapidly you incorporate those
> changes. The only firm requirement is that, at any given time
> and with any given version of Unicode, your Unicode tables and
> your IDNA ones reflect consistent versions.
> Finally, assuming Erik's tests are correct (and I understand
> them), MSIE7 is already being cautious about looking up
> unassigned code points, putting it ahead of the curve on
> IDNA200X rather than requiring retrofitting in that area.
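As an illustration of the kind of caution described above, the following Python sketch refuses to look up a label that contains code points unassigned in the Unicode version the local tables were built from. It is only a sketch under stated assumptions: the function names are hypothetical, the tables are whatever ships with Python's unicodedata module, and it says nothing about how MSIE7 actually implements its check.

```python
import unicodedata


def unassigned_code_points(label: str) -> list:
    """Return the characters in label whose general category is Cn.

    Cn covers code points that are unassigned (or noncharacters) in the
    Unicode version given by unicodedata.unidata_version.
    """
    return [ch for ch in label if unicodedata.category(ch) == "Cn"]


def safe_to_look_up(label: str) -> bool:
    """Hypothetical policy: only look up labels with no unassigned code points."""
    return not unassigned_code_points(label)


if __name__ == "__main__":
    print("Unicode tables:", unicodedata.unidata_version)
    for label in ("stra\u00dfe", "a\u0378b"):
        bad = [f"U+{ord(ch):04X}" for ch in unassigned_code_points(label)]
        print(repr(label), "ok" if not bad else f"rejected: {bad}")
```

The result depends on the Unicode version of the tables in use, which is why the tables and the IDNA rules have to reflect consistent versions.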