Concerns about the "szett" exception

Mon Oct 26 15:15:48 CET 2009

The Unicode Consortium shares your concerns about the treatment of
deviations, and the security and interoperability issues resulting from
them and from custom mappings. Unfortunately, while those points were raised
consistently during the development of IDNA2008 (some would say too
persistently), the working group decided on its current course.

We have been consulting with browser and search engine vendors (many of whom
are members of the consortium), and I would anticipate that most will not
end up implementing IDNA2008 lookup as is because of the problems it has.
TR46 is designed by those needing to implement IDNA lookup so as to provide
a bridge specification, whereby implementations can maximize compatibility
with IDNA2003 and IDNA2008 on the lookup side, and avoid these problems. On
the "dual lookup" and "trusted registries" point that you mention: the text
of TR46 is insufficiently clear. That section is discussing alternative
approaches that were considered, but discarded (because they don't work
well, as Marcos pointed out in detail). I'll make sure that that feedback is
brought into the committee.
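
As a minimal sketch of the lookup problem for the es-zett (an illustration only, not text from TR46 or IDNA2008): Python's built-in "idna" codec follows IDNA2003, so it maps U+00DF to "ss", while an IDNA2008-style lookup keeps the character and Punycode-encodes it. The helper names below are hypothetical, and the IDNA2008-style helper skips all validation.

```python
# Sketch only: contrasts IDNA2003 mapping with an IDNA2008-style encoding
# of the same label, using nothing but the Python standard library.

LABEL = "stra\u00dfe"  # "strasse" spelled with U+00DF (es-zett)


def idna2003_label(label: str) -> bytes:
    # The built-in "idna" codec implements IDNA2003 (RFC 3490); its
    # Nameprep step maps U+00DF to "ss" before any encoding happens.
    return label.encode("idna")


def idna2008_style_label(label: str) -> bytes:
    # Hypothetical helper: no mapping at all, just Punycode plus the
    # "xn--" prefix. A real IDNA2008 implementation would also validate
    # the label; that part is deliberately omitted here.
    return b"xn--" + label.encode("punycode")


print(idna2003_label(LABEL))        # b'strasse'
print(idna2008_style_label(LABEL))  # b'xn--strae-oqa'
```

The two results are different DNS labels, so an old and a new application can end up at different destinations for the same user input.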

TR46 is not really aimed at the registry side. It is feasible for registries
to implement IDNA2008 if they additionally DISALLOW the four deviations
(including es-zett). This can be done while being conformant to IDNA2008,
because registries can further limit the characters they support.
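
A rough sketch of such a registry-side restriction follows. The list of four deviation characters is my reading of TR46 and is an assumption here (this message only names the es-zett); the function names are hypothetical, and a real registry policy would do far more than this one check.

```python
from typing import List

# Assumed set of TR46 deviation characters; only U+00DF is named in
# this message, the other three are taken from my reading of TR46.
DEVIATIONS = {
    "\u00df": "LATIN SMALL LETTER SHARP S (es-zett)",
    "\u03c2": "GREEK SMALL LETTER FINAL SIGMA",
    "\u200c": "ZERO WIDTH NON-JOINER",
    "\u200d": "ZERO WIDTH JOINER",
}


def deviations_in(label: str) -> List[str]:
    # Return every deviation character found in the label, in order.
    return [ch for ch in label if ch in DEVIATIONS]


def registry_accepts(label: str) -> bool:
    # Hypothetical policy check: refuse any label containing a deviation.
    # Further limiting the permitted characters like this is allowed
    # under IDNA2008.
    return not deviations_in(label)


print(registry_accepts("strasse"))       # True
print(registry_accepts("stra\u00dfe"))   # False
```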

Mark

On Mon, Oct 26, 2009 at 05:15, Alexander Mayrhofer <
alexander.mayrhofer at nic.at> wrote:

>
> All,
>
> As you probably know, I'm working for the Austrian Domain Name Registry
> (nic.at). I've recently prepared a presentation to our board regarding
> the changes to expect from IDNAbis deployment, and I've been asked by
> our board to voice our concerns about the "szett" (U+00DF) exception in
> the current document set. I understand that the documents have
> progressed very far, and that we should have voiced our concerns earlier
> - however, I think that the information below is still valuable to the
> group.
>
> Obviously, the DNS is an extremely important identity and naming system
> that is crucial to the operation of nearly all internet applications.
> Therefore, any changes to that structure are delicate operations. This
> is important for the creation of new portions of namespace, but
> particularly important when the semantics of a namespace (portion) are
> changed. The introduction of IDNA2003 was an extension of the namespace,
> at least from the application perspective (technically, it was changing
> the definition of an awkward-enough portion of the namespace, namely
> labels with "xn--").
>
> Changing the semantics of a certain namespace is *really bad*, and I
> agree with what Marcos said a long time ago: "Breaking backwards
> compatibility is to my eyes the big stigma of IDNA2008".
>
> I understand and welcome the introduction of rigid rules in IDNAbis as
> the primary mechanism to identify codepoint classification and protocol
> validity. Independence from a certain Unicode revision ensures a stable
> specification, and should create few "surprises" (essentially, it shifts
> responsibility of character classification from the IETF to Unicode). I
> also understand and welcome the 1:1 relation on the protocol level
> between A-label and U-label.
>
> However, the introduction of *exceptions* that work around those rigid
> rules, and particularly changing the semantics of a part of a deployed,
> used namespace is *really really bad* - particularly if the exception
> concerns such a "weird" character as the "szett" (Unicode folding-wise).
> Such changes generally have the potential to change the resolved
> destination for a certain domain name, which in turn creates *major*
> security issues, and hurts interoperability badly, because unlike the
> introduction of IDNA2003, where a label would either work or not, those
> exceptions now create a situation where such a label would resolve to
> either destination A (old application) or destination B (new application).
>
> I understand that the Rationale document proposes sensible approaches in
> Section 7.2 - however, I think the security considerations could discuss
> the problems more explicitly, rather than just referring to the rationale
> document (which is informational anyway). I think that the sentence
>
>   "...a few characters that were mapped to others in the earlier version;
>   zone administrators should be aware of the problems that might raise
>   and take appropriate measures"
>
> in the definitions document could easily be overlooked by implementors.
>
> Another issue makes it even harder for zone administrators to deal with
> the problem: Actually *encouraging* application developers to create
> their own fancy mapping definitions, beyond the mappings that were
> included in IDNA2003, allows for even more "variations", and is bound to
> hurt interoperability badly. One example of this is the Unicode TR46,
> particularly the proposal of "dual lookups" and "trusted registries" for
> "Deviations", which I believe to be a really really bad idea - but what
> are the other options?
>
> Shifting the responsibility for mapping to application developers, and
> thereby allowing a myriad of mapping options to be created, seems
> risky to me, particularly for the Exception codepoints for which
> protocol definitions have changed between the two versions. From my
> point of view, it makes such codepoints unusable - the "mapping du jour"
> of application X could be entirely different than that of application Y.
>
> The Mapping draft says that it's "unusual" for the IETF to discuss user
> input processing steps - but on the other hand, Section 2.1 of RFC 3761
> (the ENUM base specification) clearly provides normative text about how
> user input should be prepared for a protocol (and I'm sure there are
> many other examples). So it seems the IETF *is* concerned about how user
> input is mapped to protocol elements.
>
> To sum up, we would have preferred the "szett" (U+00DF) to be kept
> "DISALLOWED", and to have the IETF describe the mapping procedures as more
> than just "Informational" (the contents of the mapping document itself are
> perfectly fine). We also hope that the IETF liaises with application
> developers, particularly browser vendors, to establish one single "de
> facto" mapping procedure, so that at least the szett does not become a
> moving target.
>
> Thanks,
>
> Alex Mayrhofer
>