Changing the xn-- prefix

Simon Josefsson simon at josefsson.org
Wed Mar 19 11:20:35 CET 2008


Martin Duerst <duerst at it.aoyama.ac.jp> writes:

> At 06:11 08/03/19, Simon Josefsson wrote:
>>Andrew Sullivan <ajs at commandprompt.com> writes:
>>
>>> Dear colleagues,
>>>
>>> On Tue, Mar 18, 2008 at 05:42:33PM +0100, Simon Josefsson wrote:
>>>>    Even if they wanted to do so, all registries could not convert all
>>>>    IDNA2003 ("xn--") registrations to a new form at the same time
>>>> 
>>>> I don't see why registries would need to convert anything at the same
>>>> time?  Supporting IDNABIS will be a gradual process for the few
>>>> registries that support IDNA2003 today.  I don't think any registry will
>>>> support IDNABIS the same day it is published.  There is no need to
>>>> change everything at the same time.
>>>
>>> I no longer work for a registry; but having been on the pointy end of
>>> that data stick once before, I am pretty sure nobody in the registry
>>> business will want to support both systems at once.  Explaining how
>>> all of the current approach is supposed to work even for incredibly
>>> simple cases, like German, takes much more effort than those who are
>>> participating in this discussion might believe.  I don't want to
>>> prejudge that the prefix must not change, but I sure want to be clear
>>> that changing the prefix almost certainly means a flag day, and may
>>> well cause currently-operational registrations to become invalid.
>>
>>Given ß, I'm not sure German is a simple case.
>>
>>I think Norway would be a simpler case, since they only permit
>>registration of domains with non-ASCII characters mentioned in the list
>>at <http://www.norid.no/domeneregistrering/idn/idn_nyetegn.html>.
>>Assuming those characters are unaffected by Unicode 3.2 to Unicode 5.0
>>migration, I don't see a major problem to allow registration of both the
>>IDNA2003 form and the IDNABIS form, for every IDN string.  The
>>conversion from IDNA2003 to IDNABIS may for .NO be a simple
>>s/^xn--/xp--/.
>
> I think this points to a reason why a new prefix is costly:
> In a case such as Norway, they'd add a new prefix (some cost)
> essentially for nothing (nothing at all is changing for them).

The advantage would be to support IDNA200x applications.

> In a case such as Germany (let's assume sharp s is going to
> be allowed in IDNABIS), they'll have to figure out the exact
> relationship anyway, and they'll have to do it independently
> of whether there is a new prefix or not. And the new prefix
> actually won't really help them (the fact that xn--sharp-s
> names exist if the prefix won't be changed won't bother
> IDNA2003 applications because they never look for it).

Names are not just going to be looked up, they will be embedded into
certificates and in other ways transferred to the clients in stored
form.  If xn--sharp-s is handed to an IDNA2003 application, it won't
display as ß and IDNA2003-ToASCII(xn--sharp-s) == xn--sharp-s, which
will fail string-comparison against the string 'ß'.  This seems like a
serious security problem for any IDNA2003 application to me.
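
The mismatch described above can be sketched in Python, whose built-in
"idna" codec implements IDNA2003 (RFC 3490 with Nameprep).  The label
"straße" and the helper names are assumptions made for illustration, and
raw_ace() only approximates what an encoder that keeps ß might emit; it
is not the IDNA200x algorithm.

```python
# Sketch only: Python's "idna" codec is IDNA2003; "punycode" is the raw
# RFC 3492 encoding with no Nameprep mapping applied.
SHARP_S_LABEL = "stra\u00dfe"  # "straße", a hypothetical example label


def idna2003_to_ascii(label: str) -> str:
    """IDNA2003 ToASCII for a single label, via the stdlib codec."""
    return label.encode("idna").decode("ascii")


def raw_ace(label: str, prefix: str = "xn--") -> str:
    """Punycode-encode a label WITHOUT Nameprep, so ß survives.

    Assumption: this stands in for an encoder that permits ß.
    """
    return prefix + label.encode("punycode").decode("ascii")


legacy = idna2003_to_ascii(SHARP_S_LABEL)  # Nameprep maps ß to "ss"
stored = raw_ace(SHARP_S_LABEL)            # ACE label that still encodes ß

# An all-ASCII label passes through IDNA2003 ToASCII unchanged, so the
# stored form is never normalised back to the "ss" form.
passthrough = idna2003_to_ascii(stored)

print("IDNA2003 ToASCII of the Unicode label:", legacy)
print("stored ACE label:", stored)
print("forms compare equal:", legacy == passthrough)
```

An IDNA2003 application that compares its own ToASCII output with the
stored label would see two different strings for the same name.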

Sure, changing the prefix will not solve this problem completely either,
but it allows the stored forms to contain both the IDNA2003 and IDNA200x
form during a transition period.  The certificate can contain both the
'ss' form (i.e., IDNA2003-ToASCII(ß) == ss), and the IDNA200x form with
the new prefix (i.e., xp--sharp-s).  Then string comparison will work
for IDNA2003 implementations and IDNA200x implementations.  It would
also allow people that haven't deployed IDNA2003 to just skip dealing
with the xn-- prefix, and just get on with the new prefix.
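
A minimal sketch of the transition-period idea, assuming a stored name
list (for example in a certificate) that carries both forms.  The "xp--"
prefix is the hypothetical one used in this thread, and idna200x_key()
is an illustrative stand-in, not the IDNA200x protocol.

```python
NEW_PREFIX = "xp--"  # hypothetical prefix from this thread, not assigned


def idna2003_key(label: str) -> str:
    """Comparison key an IDNA2003 client would compute (stdlib codec)."""
    return label.encode("idna").decode("ascii")


def idna200x_key(label: str) -> str:
    """Comparison key a new-prefix client might compute.

    Assumption: ß is kept and the label is Punycode-encoded as is.
    """
    if label.isascii():
        return label.lower()
    return NEW_PREFIX + label.encode("punycode").decode("ascii")


def stored_names(label: str) -> set[str]:
    """Both forms, as they might be stored during a transition period."""
    return {idna2003_key(label), idna200x_key(label)}


names = stored_names("stra\u00dfe")

# Each kind of client finds its own key by plain string comparison.
print(idna2003_key("stra\u00dfe") in names)
print(idna200x_key("stra\u00dfe") in names)
```

Because the two keys differ in prefix as well as in content, neither
client can mistake the other's form for its own.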

> In my view, a new prefix would only be necessary if there are
> any cases where xn--foo and xp--foo both exist but won't
> map to the same data (IP address). And any such case would
> be a problem in and by itself.

Indeed, and I don't follow why that would be useful.

I believe the main reason for changing the prefix would be to avoid
confusion in IDNA200x implementations about whether a string was
generated by an IDNA2003 implementation or an IDNA200x implementation,
because those two implementations will not necessarily generate the
same output string given the same input, and that confusion leads to
security problems.
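
As a rough illustration of that point: with distinct prefixes, the
origin of a stored label could be read off the label itself.  The table
below is an assumption for illustration ("xp--" is the hypothetical
prefix from this thread); with a single shared prefix no such lookup
would be possible.

```python
# Hypothetical mapping from ACE prefix to the protocol that produced it.
KNOWN_PREFIXES = {
    "xn--": "IDNA2003",
    "xp--": "IDNA200x",  # assumed new prefix, not an assigned one
}


def label_origin(label: str) -> str:
    """Guess which protocol generated a stored label from its prefix."""
    lowered = label.lower()  # ACE prefixes are matched case-insensitively
    for prefix, origin in KNOWN_PREFIXES.items():
        if lowered.startswith(prefix):
            return origin
    return "plain ASCII"


for example in ("xn--strae-oqa", "XP--strae-oqa", "strasse"):
    print(example, "->", label_origin(example))
```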

> So indeed a new prefix is quite a bit of cost for probably
> essentially no gain.

In any case, I think this thread has shed more light on this, and that
hopefully some of it will be integrated into the rationale document so
that it becomes more convincing.

/Simon

