sharp s (Eszett)

John C Klensin klensin at
Fri Mar 7 23:26:11 CET 2008

--On Friday, 07 March, 2008 12:52 -0800 Erik van der Poel
<erikv at> wrote:

> Section 5 of the protocol-04 draft starts with a discussion of
> non-Unicode encodings, then Unicode strings and so on, until
> section 5.5, where it says that the U-label is converted to
> its A-label form.
> ocol-04.txt
> Where in the IDNA200X documents does it say what a resolver is
> supposed to do with a label that is *already* in A-label form
> at the beginning? If it is not in section 5, should it be?

You are correct.  The document basically assumes that all
validity checking gets done on the U-labels (or as part of the
U-label definition).  "Rationale" essentially assumes that
A-labels that can't be obtained from valid U-labels are invalid
and vice versa, but the current text of "Protocol" doesn't
enforce that assumption.  In particular, that leaves a hole if
someone creates a funky A-label that could not have been formed
via the U-label process.  I think that hole needs to be
plugged, but am troubled about the case of non-IDNA-aware
applications in which it is impossible to state, much less
enforce, any sort of validity checks on A-labels.
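
To make the round-trip assumption concrete, here is a minimal
sketch of one possible check, not text from any of the IDNA200X
documents.  It uses Python's built-in "punycode" codec; the
function name and the has_non_ascii placeholder are hypothetical,
and a real implementation would substitute the full U-label
validity rules for that placeholder.

```python
ACE_PREFIX = "xn--"


def has_non_ascii(u_label):
    # Placeholder (assumption): stands in for the real U-label
    # validity rules, which are far stricter than this.
    return any(ord(ch) > 0x7F for ch in u_label)


def is_round_trip_a_label(label, is_valid_u_label=has_non_ascii):
    """Return True only if `label` decodes to an acceptable U-label
    that re-encodes to exactly the same (lowercased) A-label."""
    lowered = label.lower()
    if not lowered.startswith(ACE_PREFIX):
        return False
    try:
        # Strip the prefix and decode the Punycode part.
        u_label = lowered[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    except UnicodeError:
        # Non-ASCII input or malformed Punycode: no U-label exists.
        return False
    if not is_valid_u_label(u_label):
        return False
    try:
        re_encoded = ACE_PREFIX + u_label.encode("punycode").decode("ascii")
    except UnicodeError:
        return False
    return re_encoded == lowered


if __name__ == "__main__":
    for candidate in ("xn--strae-oqa", "xn--strae-oq", "xn--abc-"):
        print(candidate, is_round_trip_a_label(candidate))
```

Under these assumptions, "xn--strae-oqa" is accepted, while a
truncated string such as "xn--strae-oq" or an all-ASCII payload
such as "xn--abc-" is rejected.  An application that is not
IDNA-aware would never run such a check, which is the case that
remains troubling.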

I'm happy to leave a discussion of whether that hole should be
plugged, how, and how aggressively, to the proposed WG.  Given
the current charter model, I think it is pretty clear that this
question (as well as the question of whether or not certain
characters, Eszett included, should be accommodated) is
independent of the question of whether consensus can be reached
on moving forward with the general "IDNA200X" model.

Whether this proves that we need a WG or not, it certainly
demonstrates that there are still some loose ends and unresolved
issues in the existing documents.  I hope that no one has
claimed otherwise.

