Changing the xn-- prefix
simon at josefsson.org
Tue Mar 18 19:56:36 CET 2008
Is there a reason you don't use the ACE string in the canonical form?
Using a decoded form seems prone to errors: if you happen to have a bug
in the decoder from the wire-form to your internal canonical form, you
will have a database migration problem. The ACE string is what is used
on the wire, and the ACE string is what is published by the information
producer, so using it would never be incorrect, if I understand correctly.
If IDNABIS keeps the same prefix but makes backwards-incompatible
changes, it seems you will have the same problem. Arguably the problem
may be much smaller if only PR-29 strings are affected. But if ß != ss
in IDNABIS your problem could be significant.
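
To make the ß example concrete, here is a minimal sketch using Python's
built-in "idna" codec, which implements IDNA2003. The second value is only
an illustration of what a non-mapping encoding could produce for the same
input; it reuses xn-- purely as an assumption for comparison, and is not
taken from any IDNABIS draft.

```python
# IDNA2003 ToASCII as implemented by Python's built-in "idna" codec.
# Nameprep folds ß to "ss", so the result is a plain ASCII label.
idna2003 = "straße".encode("idna")

# Raw Punycode of the unmapped label behind the same prefix: what an
# encoding that treated ß as a distinct character would yield.
unmapped = b"xn--" + "straße".encode("punycode")

print(idna2003)  # b'strasse'
print(unmapped)  # b'xn--strae-oqa'
```

The same user input yields two different ACE strings, so a stored decoded
form cannot say which one was meant.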
"Mark Davis" <mark.davis at icu-project.org> writes:
> Changing the prefix would be really nasty. For folks like us at Google, it
> is important to have a canonical form for URLs, and then map to the
> on-the-wire form. When we have a domain name with Unicode characters in it,
> which do we pick when we want to go out on the wire? You would really have
> to have very strict enforcement of the policy on registries that no matter
> what, where both the IDNA2003 and IDNA200x forms were both valid, that both
> lead to the same location. Is that really practical?
> On Tue, Mar 18, 2008 at 9:42 AM, Simon Josefsson <simon at josefsson.org> wrote:
>> Harald Tveit Alvestrand <harald at alvestrand.no> writes:
>> > Simon,
>> > Simon Josefsson skrev:
>> >> I note that using a new prefix instead of xn-- would avoid this
>> >> Specifications and implementations that use IDNA2003 continue to use
>> >> xn-- and will work fine within its limitations. New specifications and
>> >> implementations that support IDNABIS will use another prefix and also
>> >> work fine.
>> >> I'm not suggesting we adopt this approach, but I haven't
>> >> seen the disadvantages of changing the prefix clearly expressed yet.
>> >> There is a cost in maintaining both IDNA2003 and IDNABIS encodings of
>> >> strings during a transition-period. Whether that cost is higher or
>> >> lower than the complexity in re-using the old prefix for something that
>> >> won't be fully backwards compatible is not clear to me.
>> > section 9.3.3 of draft-klensin-idnabis-issues-07 tries to describe (in
>> > just a few sentences) the cost drivers that (I think) make a prefix
>> > change a very expensive proposition, both in terms of work for the DNS
>> > operators and in terms of ongoing execution-time costs of application.
>> > That argument convinced me; if you find any part of that unclear, or
>> > disagree with the conclusions, feedback on the text would be welcome.
>> The section didn't convince me. It seems to repeatedly assert that the
>> costs are "considerable" without going into technical details.
>> There are some claims that look substantive:
>>    Even if they wanted to do so, all registries could not convert all
>>    IDNA2003 ("xn--") registrations to a new form at the same time
>> I don't see why registries would need to convert anything at the same
>> time? Supporting IDNABIS will be a gradual process for the few
>> registries that support IDNA2003 today. I don't think any registry will
>> support IDNABIS the same day it is published. There is no need to change
>> everything at the same time.
>>    systems that needed to support both labels
>>    with old prefixes and labels with new ones would first process a
>>    putative label under the IDNA200X rules and try to look it up and
>>    then, if it were not found, would process the label under IDNA2003
>>    rules and look it up again.
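
The two-pass lookup described in the quoted passage can be sketched roughly
as follows. This is an illustration only: the "xp--" prefix is the
hypothetical one used elsewhere in this thread, the new form is assumed to
be plain Punycode with no mapping, and `resolve` stands in for a real DNS
query.

```python
from typing import Callable, Optional, Tuple

NEW_PREFIX = "xp--"  # hypothetical new prefix, as used in this thread


def to_old_ace(label: str) -> str:
    # IDNA2003 ToASCII via Python's built-in codec (includes Nameprep mapping).
    return label.encode("idna").decode("ascii")


def to_new_ace(label: str) -> str:
    # Assumption: the new form is unmapped Punycode behind the new prefix.
    return NEW_PREFIX + label.encode("punycode").decode("ascii")


def lookup_with_fallback(
    label: str, resolve: Callable[[str], Optional[str]]
) -> Tuple[Optional[str], int]:
    """Try the new-prefix form first, then fall back to IDNA2003.

    Returns the answer (or None) and the number of queries issued.
    """
    queries = 0
    for to_ace in (to_new_ace, to_old_ace):
        queries += 1
        answer = resolve(to_ace(label))
        if answer is not None:
            return answer, queries
    return None, queries


# Toy zone that only holds the old IDNA2003 registration.
zone = {"xn--bcher-kva": "192.0.2.1"}
result = lookup_with_fallback("bücher", zone.get)
print(result)  # ('192.0.2.1', 2)
```

Every name that exists only in the old form costs a second query, which is
the execution-time cost the draft is pointing at.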
>> IDNABIS could say that for backwards-compatibility reasons, when you create
>> a domain xp--foo in your zone (for some non-ASCII string), the software
>> needs to make sure there is a xn--foo for the corresponding IDNA2003
>> name too, if there is an equivalent IDNA2003 name.
>> Yes, this requires some special text intended for people creating and
>> maintaining zone files. However, such text is needed anyway. The process
>> of populating a zone file for non-ASCII domains is complicated and there
>> are many fine details that cause problems.
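
As a rough sketch of the zone-side aliasing suggested above, a provisioning
tool could emit a CNAME from the IDNA2003 name to the new-prefix name. The
assumptions here are mine and not from any draft: "xp--" is the hypothetical
prefix, the new form is unmapped Punycode, and "an equivalent IDNA2003 name"
is taken to mean that the IDNA2003 encoding decodes back to the same label.

```python
from typing import List

NEW_PREFIX = "xp--"  # hypothetical new prefix, as used in this thread


def alias_records(label: str) -> List[str]:
    """Return zone-file lines aliasing the IDNA2003 name to the new name."""
    new_ace = NEW_PREFIX + label.encode("punycode").decode("ascii")
    old_ace = label.encode("idna").decode("ascii")

    # Only alias when IDNA2003 round-trips to the same label; otherwise the
    # old name denotes a different string (e.g. "strasse" for "straße").
    if old_ace.encode("ascii").decode("idna") != label:
        return []
    return [f"{old_ace} IN CNAME {new_ace}"]


print(alias_records("bücher"))  # ['xn--bcher-kva IN CNAME xp--bcher-kva']
print(alias_records("straße"))  # []
```

With such aliases in place, a resolver only ever asks for one name per
label, whichever prefix its software produces.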
>>    That process could significantly slow down all processing that
>>    involved IDNs in the DNS especially since, in principle, a
>>    fully-qualified name could contain a mixture of labels that were
>>    registered with the old and new prefixes, a situation that would make
>>    the use of DNS caching very difficult.
>> That is false for the CNAME approach.
>>    In addition, looking up the same input string as two separate
>>    A-labels would create some potential for confusion and attacks, since
>>    they could, in principle, resolve to different targets.
>> This threat doesn't seem applicable to the CNAME approach.
>> I'm not proposing that we should change the prefix here, but I'd like to
>> understand the disadvantages in doing so. There are some advantages:
>> Other backwards incompatible changes appear to be considered at this
>> point, such as using a newer Unicode version or changing how ß is
>> handled. It will be simple to make those other backwards incompatible
>> changes if we change the prefix.
>> Idna-update mailing list
>> Idna-update at alvestrand.no