Unregistered code points and new prefixes (was: Re: sharp s(Eszett))

Erik van der Poel erikv at google.com
Sun Mar 9 17:49:36 CET 2008

On Sun, Mar 9, 2008 at 1:47 AM, Cary Karp <ck at nic.museum> wrote:
> Quoting Erik:
>  > Martin said it the way I would have, roughly, "Who are you trying to
>  > protect, and from what?" It does not make much sense to disallow the
>  > lookup of an invalid A-label by an IDNA-aware application, let alone
>  > an IDNA-unaware application.
>  >
>  > However, if the consensus of the group is to disallow invalid A-label
>  > lookups, then I will reluctantly go along with it. It does make
>  > migration to future versions of Unicode a bit harder than it needs to
>  > be. Michel said it quite well too, something like "I'm not very
>  > excited about it."
>  The gTLD registries are forbidden by the terms of their contracts with
>  ICANN from accepting the registration of any labels with hyphens in the
>  third and fourth positions unless an explicit further agreement has
>  been signed in which the registry binds itself to follow the ICANN IDN
>  Guidelines,
>         http://icann.org/topics/idn/idn-guidelines-26apr07.pdf
>  which do not permit prefixing random strings with xn--.

Thanks, Cary. That was certainly informative.

>  The ccTLD registries are not subject to the same across the board
>  constraint. The resulting user confusion is significant and the
>  retroactive imposition of remedial policies is not a viable means for
>  addressing this problem (nor is simply dismissing or tolerating it).

Do you have some suggestions for addressing this problem?


The core issue we are discussing in this thread, however, is that of
unassigned code points. A registry may register a label that is
perfectly valid at the time, if the code points are assigned in the
then-current version of Unicode and those code points follow the rules
of the IDNA200X standards (assuming they become RFCs at some point in
the future). A content provider may even include a reference to the
new A-label (e.g. in an HTML link); this is perfectly valid.

The question, however, is what an *old* client is permitted to do in
this instance. If the old client is based on an older version of
Unicode, is it allowed to look up a domain name with labels that are
*already* in A-label form? If not, hopefully the vendor of that old
client has a mechanism in place to update the client, either
automatically or through user intervention. Thankfully, the
proliferation of security attacks has motivated vendors to develop
such updating facilities.
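
To make the question concrete, here is a rough sketch of the check an old
client would be making. It is only an illustration: the function names and
the `strict` policy flag are hypothetical, and Python's bundled Unicode
database stands in for "the Unicode version the client was built with". It
does not attempt any of the other IDNA200X validity rules.

```python
import unicodedata

ACE_PREFIX = "xn--"

def decode_a_label(label: str) -> str:
    """Decode an A-label to its Unicode form with the stdlib punycode codec."""
    if not label.lower().startswith(ACE_PREFIX):
        raise ValueError("not an A-label: %r" % label)
    return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")

def unassigned_code_points(text: str) -> list:
    """Return the characters this client's Unicode tables do not know (category Cn)."""
    return [ch for ch in text if unicodedata.category(ch) == "Cn"]

def may_look_up(label: str, strict: bool) -> bool:
    """Decide whether the client passes the label on to the resolver.

    strict=True  : refuse A-labels containing code points that are
                   unassigned in the client's Unicode version.
    strict=False : look up any label that is already in A-label form.
    (Both policies are assumptions made for this sketch.)
    """
    if not label.lower().startswith(ACE_PREFIX):
        return True  # ordinary ASCII label, nothing to check here
    unknown = unassigned_code_points(decode_a_label(label))
    return not (strict and unknown)

# The Unicode version this "client" is based on.
print("Unicode version:", unicodedata.unidata_version)
print(may_look_up("xn--mnchen-3ya", strict=True))
```

Under the strict policy, a label registered against a newer Unicode version
fails in the old client until the client is updated; under the lenient
policy, the same label is simply passed on to the DNS.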

Do mobile phones also have such updating facilities?

