<html>

  <head>

    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">On 8/10/2014 9:44 PM, Patrik Fältström

      wrote:<br>

    </div>

    <blockquote

      cite="mid:D5736458-9954-42B9-9C50-61CEA83D5B0D@frobbit.se"

      type="cite">

      <pre wrap="">On 10 aug 2014, at 21:15, Asmus Freytag <a class="moz-txt-link-rfc2396E" href="mailto:asmusf@ix.netcom.com">&lt;asmusf@ix.netcom.com&gt;</a> wrote:

</pre>

      <blockquote type="cite">

        <pre wrap="">The most important is the ability to create equivalence classes among

code point (and sequences), known as variant sets.

</pre>

      </blockquote>

      <pre wrap="">

Variants have nothing to do with the equivalence that normalization does, and you can never ever replace lack of normalization with an equivalence set.</pre>

    </blockquote>

    <br>

    How so? <br>

    <br>

    <br>

    <blockquote

      cite="mid:D5736458-9954-42B9-9C50-61CEA83D5B0D@frobbit.se"

      type="cite">

      <pre wrap="">

As John has explained, the issue here is that we have two set of representations that might be treated the same, without any normalization that say they are equivalent.

IETF has decided that IETF is to follow the rules that Unicode Consortium has created.

This basic rule lead to the change in IDNA2008 that ß is not to be treated the same as 'ss', as they where equivalent in IDNA2003 due to case folding rules (one of the things removed to IDNA2008 so that A-label and U-label are 1:1 mappings and translation between the two is reversible).

IDNA do have a mechanism for exceptions, and the whole idea for that is that we should be able to have these discussions.</pre>

    </blockquote>

    <br>

    IDNA's exception mechanism cannot actually amend normalization and

    force something like ß to be treated the same as 'ss'. However, it

    could (in principle) disallow either of them, because it is

    fundamentally an exception on repertoire, not an exception to or

    extension of normalization.
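    <br>
    <br>
    To make the distinction concrete, here is a minimal Python sketch (an
    illustration only, not anything taken from the IDNA specifications). It
    uses the standard unicodedata module to show that no normalization form
    relates ß to 'ss', while case folding does. The DISALLOWED set and the
    is_permitted helper are hypothetical names standing in for a
    repertoire-style exception, which can only permit or forbid a code point:<br>

```python
import unicodedata

SHARP_S = "\u00df"  # LATIN SMALL LETTER SHARP S

# Every Unicode normalization form leaves the sharp s unchanged,
# so normalization alone never equates it with "ss".
forms = {
    form: unicodedata.normalize(form, SHARP_S)
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}
unchanged_by_normalization = all(value == SHARP_S for value in forms.values())

# Case folding is a separate operation, and it does map sharp s to "ss".
folded = SHARP_S.casefold()

# Hypothetical repertoire-style exception: a set of code points that are
# simply not permitted. It cannot express "these two labels are the same".
DISALLOWED = {SHARP_S}


def is_permitted(label, disallowed=DISALLOWED):
    """Return True if no code point of the label is in the disallowed set."""
    return all(ch not in disallowed for ch in label)
```

    In this sketch, disallowing ß removes labels that contain it from the
    repertoire, but nothing in it makes such a label equivalent to its 'ss'
    spelling.<br>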

    <blockquote

      cite="mid:D5736458-9954-42B9-9C50-61CEA83D5B0D@frobbit.se"

      type="cite">

      <pre wrap="">

So can we please stay with this discussion on what is to be used in DNS?</pre>

    </blockquote>

    <blockquote

      cite="mid:D5736458-9954-42B9-9C50-61CEA83D5B0D@frobbit.se"

      type="cite">

      <pre wrap="">

Variants have nothing to do with that. Variants have to do with *registration*policy*for the root zone, and then maybe a few TLDs. 

Nothing else.</pre>

    </blockquote>

    <br>

    I think it is useful to distinguish the technology from the

    implementation. And it's useful and illuminating to consider what

    that technology can deliver that is different from the exception

    mechanism in IDNA. And, of course, also whether it has technical

    drawbacks.<br>

    <br>

    From a technical point of view, (blocked) variants are relatively similar to

    normalization. Both define equivalence sets, but variants leave the

    choice of "preferred" element open, while normalization fixes it.<br>
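    <br>
    As a rough illustration of that similarity, the following Python sketch
    contrasts the two. Normalization maps every member of an equivalence set
    to one fixed form, whereas a blocked-variant registry only records that a
    set is taken and keeps whichever label was registered first. The
    VARIANT_SETS table (Latin "a" and Cyrillic "а" as an assumed homoglyph
    pair), class_key and Registry are hypothetical names, not part of any
    actual registry implementation:<br>

```python
import unicodedata

# Normalization: the preferred element is fixed by the algorithm.
# "e" followed by COMBINING ACUTE ACCENT always composes to U+00E9 under NFC.
composed = unicodedata.normalize("NFC", "e\u0301")

# Hypothetical variant set: Latin "a" and Cyrillic "a" (U+0430).
VARIANT_SETS = [frozenset({"a", "\u0430"})]


def class_key(label):
    """Build an internal index key for the equivalence class of a label.

    The key is only used for lookup; it is never offered as a preferred form.
    """
    out = []
    for ch in label:
        for variants in VARIANT_SETS:
            if ch in variants:
                ch = min(variants)  # arbitrary but stable representative
                break
        out.append(ch)
    return "".join(out)


class Registry:
    """Blocked variants: the first label registered in a class blocks the rest."""

    def __init__(self):
        self.taken = {}  # class key mapped to the label as actually registered

    def register(self, label):
        key = class_key(label)
        if key in self.taken:
            return False  # a variant of this label is already registered
        self.taken[key] = label
        return True
```

    Registering either spelling first succeeds and blocks the other, so the
    "preferred" element is decided by the registrant rather than by the
    algorithm.<br>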

    <br>

    I'm tacitly assuming that we are considering only variants of the

    homoglyph/homograph nature, because the other kind(s) are a

    different kettle of fish altogether.<br>

    <br>

    On top of any technical differences come the differences regarding

    who implements variants vs. exceptions.<br>

    <br>

    Given those differences, the following quote from a parallel thread

    is illuminating:<br>

    <br>

    <blockquote type="cite">

      <div class="gmail_default"><font face="times new roman, serif">For

          example, on Mon, Dec 22, 2008 at 8:03 AM, John C Klensin <span

            dir="ltr">&lt;<a href="mailto:klensin@jck.com"

              target="_blank">klensin@jck.com</a>&gt;</span> wrote:</font></div>

      <div class="gmail_default"><font face="times new roman, serif">...<br>

        </font>

        <blockquote class="gmail_quote" style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><font

            face="times new roman, serif"><span style="color:rgb(0,0,0)">(i)

              What is, and is not, look-alike, is a very subjective</span><br

              style="color:rgb(0,0,0)">

            <span style="color:rgb(0,0,0)">business. </span></font></blockquote>

        <div><font face="times new roman, serif">... </font></div>

        <blockquote class="gmail_quote" style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

          <div id=":2nq" class="" style="overflow:hidden"><font

              face="times new roman, serif">The bottom line is that

              we've concluded that character<br>

              combinations that are specifically phishing issues should

              be<br>

              dealt with by registries, who presumably know what they

              are<br>

              doing with scripts they choose to support, and by

              application<br>

              implementers who can warn people against hazardous

              combinations<br>

              (and potentially against registries who persistently

              permit<br>

              registration of strings that have no real value other than

              to<br>

              create phishing opportunities.  </font></div>

        </blockquote>

        <div><font face="times new roman, serif">... </font></div>

        <blockquote class="gmail_quote" style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><font

            face="times new roman, serif"><span style="color:rgb(0,0,0)">These

              decisions were the result of explicit (and quite lengthy)</span><br

              style="color:rgb(0,0,0)">

            <span style="color:rgb(0,0,0)">discussion, not an

              "oversight".</span></font></blockquote>

      </div>

    </blockquote>

    <br>

    Reading that, I would expect the IDNA protocol's exception mechanism

    to be used in places where the issues are either so universal or so

    grave as to warrant baking the solution in at the front end, and I

    would expect it to defer to other mechanisms available to registries

    (such as variants) to handle less clear-cut cases that are not of

    universal concern (but are still concerns).<br>

    <br>

    I believe that is a very proper discussion to have.<br>

    <br>

    Consider the facts of the case: the sequences and singleton in

    question are relatively obscure; the singleton was encoded later

    but is potentially the more practically useful one; and there are

    many parallel cases that were not addressed via the exception

    mechanism. Given all these facts, I am somewhat doubtful whether the

    current case meets the criteria of importance and consistency that

    would require it to be addressed in IDNA.<br>

    <br>

    The more I learn about the particulars of the case, the more I keep

    thinking that (despite looking like a normalization problem) it

    really isn't one, and is more like the class of problem addressed in

    John's quote from 2008.<br>

    <br>

    A./<br>

    <br>

  </body>

</html>