<html>
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 8/10/2014 9:44 PM, Patrik Fältström
wrote:<br>
</div>
<blockquote
cite="mid:D5736458-9954-42B9-9C50-61CEA83D5B0D@frobbit.se"
type="cite">
<pre wrap="">On 10 aug 2014, at 21:15, Asmus Freytag <a class="moz-txt-link-rfc2396E" href="mailto:asmusf@ix.netcom.com">&lt;asmusf@ix.netcom.com&gt;</a> wrote:
</pre>
<blockquote type="cite">
<pre wrap="">The most important is the ability to create equivalence classes among
code point (and sequences), known as variant sets.
</pre>
</blockquote>
<pre wrap="">
Variants have nothing to do with the equivalence that normalization does, and you can never ever replace lack of normalization with an equivalence set.</pre>
</blockquote>
<br>
How so? <br>
<br>
<br>
<blockquote
cite="mid:D5736458-9954-42B9-9C50-61CEA83D5B0D@frobbit.se"
type="cite">
<pre wrap="">
As John has explained, the issue here is that we have two sets of representations that might be treated the same, without any normalization that says they are equivalent.
The IETF has decided that it is to follow the rules that the Unicode Consortium has created.
This basic rule led to the change in IDNA2008 that ß is not to be treated the same as 'ss', as they were equivalent in IDNA2003 due to case folding rules (one of the things removed in IDNA2008 so that A-label and U-label are 1:1 mappings and translation between the two is reversible).
IDNA does have a mechanism for exceptions, and the whole idea for that is that we should be able to have these discussions.</pre>
</blockquote>
<br>
IDNA's exception mechanism cannot actually amend normalization and
force something like ß to be treated the same as 'ss'. However, it
could (in principle) disallow either of them, because it is
fundamentally an exception to the repertoire, not an exception to or
extension of normalization.
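<br>
<br>
To illustrate the distinction, here is a minimal Python sketch. It is
only an illustration under stated assumptions: Python's str.casefold()
stands in for an IDNA2003-style mapping step, and the DISALLOWED set
and the is_permitted() helper are hypothetical names of mine, not part
of any IDNA table or library.<br>
<pre wrap="">
```python
import unicodedata

# Hypothetical exception table: code points removed from the repertoire.
# Assumption for illustration only; this is not the actual IDNA status of sharp s.
DISALLOWED = {"\u00df"}

def is_permitted(label):
    """A repertoire check can only accept or reject a label; it never maps it."""
    return not any(ch in DISALLOWED for ch in label)

sharp = "stra\u00dfe"   # spelled with sharp s
plain = "strasse"       # spelled with 'ss'

# NFC leaves sharp s untouched, so normalization does not equate the labels.
nfc_equal = unicodedata.normalize("NFC", sharp) == unicodedata.normalize("NFC", plain)

# Case folding (standing in for the IDNA2003-style mapping) does equate them.
fold_equal = sharp.casefold() == plain.casefold()

print(nfc_equal, fold_equal, is_permitted(sharp), is_permitted(plain))
```
</pre>
In the sketch, the repertoire check can reject one of the two labels,
but nothing in it makes the two labels compare equal.<br>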
<blockquote
cite="mid:D5736458-9954-42B9-9C50-61CEA83D5B0D@frobbit.se"
type="cite">
<pre wrap="">
So can we please stay with this discussion on what is to be used in DNS?</pre>
</blockquote>
<blockquote
cite="mid:D5736458-9954-42B9-9C50-61CEA83D5B0D@frobbit.se"
type="cite">
<pre wrap="">
Variants have nothing to do with that. Variants have to do with *registration policy* for the root zone, and then maybe a few TLDs.
Nothing else.</pre>
</blockquote>
<br>
I think it is useful to distinguish the technology from the
implementation. And it's useful and illuminating to consider what
that technology can deliver that is different from the exception
mechanism in IDNA. And, of course, also whether it has technical
drawbacks.<br>
<br>
From a technical point of view, (blocked) variants are relatively
similar to normalization. Both define equivalence sets, but variants
leave the choice of "preferred" element open, while normalization
doesn't.<br>
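<br>
To make that comparison concrete, here is a minimal Python sketch. The
variant set (Latin a and Cyrillic a treated as homoglyphs), the
variant_key() helper and the Registry class are hypothetical and only
for illustration; they are not taken from any registry's actual
tables or software.<br>
<pre wrap="">
```python
import unicodedata

# Normalization: every member of an equivalence class maps to one fixed
# representative, and the standard decides which one.
composed = "\u00e9"      # e with acute, one code point
decomposed = "e\u0301"   # e followed by combining acute
same_under_nfc = unicodedata.normalize("NFC", decomposed) == composed

# Blocked variants: the set has no built-in representative. Whichever
# member is registered first is kept, and the rest of its set is blocked.
# Hypothetical variant set: Latin a and Cyrillic a as homoglyphs.
VARIANT_SETS = [{"a", "\u0430"}]

def variant_key(label):
    """Replace each code point by its variant set, for comparison only."""
    key = []
    for ch in label:
        members = next((s for s in VARIANT_SETS if ch in s), {ch})
        key.append(frozenset(members))
    return tuple(key)

class Registry:
    def __init__(self):
        self.taken = {}  # variant key mapped to the label actually registered

    def register(self, label):
        key = variant_key(label)
        if key in self.taken:
            return False  # blocked by an earlier registration in the same set
        self.taken[key] = label  # the label is stored unchanged
        return True

registry = Registry()
first = registry.register("p\u0430y")  # spelling with Cyrillic a arrives first
second = registry.register("pay")      # the all-Latin variant is now blocked
print(same_under_nfc, first, second, list(registry.taken.values()))
```
</pre>
Note that the registry never rewrites a label: the "preferred" member
of the set is simply whichever one was registered first.<br>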
<br>
I'm tacitly assuming that we are considering only variants of the
homoglyph/homograph nature, because the other kind(s) are a
different kettle of fish altogether.<br>
<br>
On top of any technical differences come differences in who
implements variants vs. exceptions.<br>
<br>
Given those differences, the following quote from a parallel thread
is illuminating:<br>
<br>
<blockquote type="cite">
<div class="gmail_default"><font face="times new roman, serif">For
example, on Mon, Dec 22, 2008 at 8:03 AM, John C Klensin <span
dir="ltr">&lt;<a href="mailto:klensin@jck.com"
target="_blank">klensin@jck.com</a>&gt;</span> wrote:</font></div>
<div class="gmail_default"><font face="times new roman, serif">...<br>
</font>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><font
face="times new roman, serif"><span style="color:rgb(0,0,0)">(i)
What is, and is not, look-alike, is a very subjective</span><br
style="color:rgb(0,0,0)">
<span style="color:rgb(0,0,0)">business. </span></font></blockquote>
<div><font face="times new roman, serif">... </font></div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div id=":2nq" class="" style="overflow:hidden"><font
face="times new roman, serif">The bottom line is that
we've concluded that character<br>
combinations that are specifically phishing issues should
be<br>
dealt with by registries, who presumably know what they
are<br>
doing with scripts they choose to support, and by
application<br>
implementers who can warn people against hazardous
combinations<br>
(and potentially against registries who persistently
permit<br>
registration of strings that have no real value other than
to<br>
create phishing opportunities). </font></div>
</blockquote>
<div><font face="times new roman, serif">... </font></div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><font
face="times new roman, serif"><span style="color:rgb(0,0,0)">These
decisions were the result of explicit (and quite lengthy)</span><br
style="color:rgb(0,0,0)">
<span style="color:rgb(0,0,0)">discussion, not an
"oversight".</span></font></blockquote>
</div>
</blockquote>
<br>
Reading that, I would expect the IDNA protocol's exception mechanism
to be used where the issues are either so universal or so grave as to
warrant baking the solution in at the front end. I would also expect
it to defer to other mechanisms available to registries (such as
variants) for less clear-cut cases that are not of universal concern
(but are still concerns).<br>
<br>
I believe that is a very proper discussion to have.<br>
<br>
Consider the facts of the case: the sequences and the singleton in
question are relatively obscure; the singleton was encoded later but
is potentially the more practically useful one; and many parallel
cases were not addressed via the exception mechanism. Given all
these facts, I am somewhat doubtful whether the current case meets
the criteria of importance and consistency that would require it to
be addressed in IDNA.<br>
<br>
The more I learn about the particulars of the case, the more I keep
thinking that, despite looking like a normalization problem, it
really isn't one and is more like the class of problem addressed in
John's quote from 2008.<br>
<br>
A./<br>
<br>
</body>
</html>