&gt; Many, including Arabic, Sanskrit and Dhivehi. Possibly Hebrew too. But<br>&gt; &quot;leaving out&quot; may be an underspecified term here - see next comment.

<br><br>Your statement pretty much floored me. Before we remove the ability to use domain names from billions of people, it&#39;d be good to have solid, defensible reasons for doing so.<br><br>I really would like to get back to my original message, which was to try to get a solid problem statement so that we can assess what we are doing against that. John hasn&#39;t replied to Ken&#39;s suggestions for changes to 

<a href="http://www.ietf.org/internet-drafts/draft-klensin-idnabis-issues-00.txt" target="_blank">http://www.ietf.org/internet-drafts/draft-klensin-idnabis-issues-00.txt

</a>, and frankly even with those changes the document does not yet provide sufficient rationale for the steps it proposes. As I said, when we set out to fix an engineering problem, we need a clear statement of the problems, with example scenarios for each. Without the scenarios, you often don&#39;t get a clear idea from all parties as to what the issues really are, and why fixes need to be made.

You can then assess the options based on how well they handle the problems, and you have some concrete cases to look at, instead of misty, inchoate anxieties.<br><br>So I&#39;d really like feedback on the problem statement, included below. I made an update to #1 as per your message. I don&#39;t think at all that #2 and #4 are irrelevant -- you may have read them over too quickly. We need a good statement as to the problems that removing Arabic, for example, would fix AND why we can&#39;t solve the problem without removing Arabic. 

<br><br>Here is a restatement of what I see as the problems.<br><br>1. The current IDNA is bound to a specific version of Unicode, and therefore does not allow the adoption of new scripts over time; in particular, it does not allow Unicode 5.0 characters. Examples: see the updated section &quot;## Show a list of all the characters not currently allowed&quot; of 

<a href="http://www.macchiato.com/idn/UnicodePropertyResults.html">http://www.macchiato.com/idn/UnicodePropertyResults.html</a> . This takes the current proposal given by the rules we&#39;re working on, and shows the characters not permitted by the current IDNA. It currently amounts to 956 characters. (You may have to refresh your browser to see it.) Now, many of these are historic characters, and won&#39;t much matter to modern users, but many are in current, modern usage, and required by well-populated languages.

2. It restricts some combinations that are required for certain languages.

<br>&nbsp; a) Mn (nonspacing marks) at the end of BIDI fields; as in Dhivehi, see <a href="http://www.ietf.org/internet-drafts/draft-alvestrand-idna-bidi-00.txt">http://www.ietf.org/internet-drafts/draft-alvestrand-idna-bidi-00.txt</a><br>&nbsp; b) ZWJ/ZWNJ in limited contexts; see <a href="http://www.unicode.org/review/pr-96.html">

http://www.unicode.org/review/pr-96.html</a><br><br>3. There are concerns about the stability of normalization<br>(discussed elsewhere)<br><br>4. There are opportunities for spoofing. This breaks down into a number of sub-problems, of which the major ones are:

<br>

&nbsp; a) non-letter confusables; like fraction slash in <a href="http://amazon.com/badguy.com" target="_blank">amazon.com/badguy.com</a><br>&nbsp; b) confusable letters/numbers within mixtures of scripts; like Cyrillic &#39;a&#39; in 

<a href="http://paypal.com">paypal.com</a>

<br>&nbsp; c) confusable letters in same script; like 

<a href="http://inte1.com/" target="_blank">inte1.com</a><br><br>If there are other problems beyond these, I&#39;d <span style="font-style: italic;">really</span> like to know about them. Otherwise I can only foresee continuing confusion.

<br><br>Mark<br><br><div><span class="gmail_quote">On 12/19/06, <b class="gmail_sendername">Harald Alvestrand</b> &lt;

<a href="mailto:harald@alvestrand.no" target="_blank">harald@alvestrand.no</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Mark Davis wrote:<br>&gt;<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; If that is accepted as the problem definition, it is reasonable to<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; assume<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; that a solution does NOT lock us again into a fixed set of<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; scripts, but<br>

&gt;&nbsp;&nbsp;&nbsp;&nbsp; rather allows scripts to be added in an incremental fashion.<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; And if that is accepted, the option of disallowing a script &quot;until<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; we have<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; sorted out the identified issues&quot; becomes far less of an issue

<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; than it<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; seems to be regarded by Mark/Ken/Michel today<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; (apologies if I have mischaracterized a position here).<br>&gt;<br>&gt;<br>&gt;<br>&gt; I think the &quot;until we have sorted out the identified issues&quot; is too

<br>&gt; vague to be a useful criterion. There is general consensus that there<br>&gt; isn&#39;t any problem with leaving out the historic scripts (although, as<br>&gt; I said, frankly it doesn&#39;t buy much in terms of reducing spoofing).<br>&gt; But which other scripts did you have in mind omitting, and on what<br>&gt; grounds?<br>Many, including Arabic, Sanskrit and Dhivehi. Possibly Hebrew too. But<br>&quot;leaving out&quot; may be an underspecified term here - see next comment.

<br>&gt;<br>&gt; There is also a big difference between the flexibility in the protocol<br>&gt; vs that available to registries and user-agents. Suppose that in the<br>&gt; protocol we allow Hebrew, but recommend against (for some reason)

<br>&gt; final forms of letters. Registries and user-agents can then start by<br>&gt; following those recommendations, but if it turns out to be necessary<br>&gt; to allow them in (either fully or in limited circumstances), it is

<br>&gt; relatively easy for them to do so. Baking a prohibition against<br>&gt; final-forms of letters into the protocol is a much different matter --<br>&gt; it takes quite a while for everyone to update to a new version. (And

<br>&gt; during that time, I have no doubt that we will hear charges of<br>&gt; discrimination...)<br>&gt;<br>You may want to review draft-klensin-idnabis-issues again, and see at<br>which steps of the protocol we are thinking of switching to an

<br>inclusion-based model that starts off with a sharply limited set.<br><br>I think we are best served if we install the maximum amount of<br>restrictions initially in section 2.1.3 &quot;Registration of IDNs -<br>Permitted Character Identification&quot; (and therefore also in section

<br>2.1.5), while we should install a minimum set of restrictions in section<br>2.2.3 &quot;Domain Name Resolution - Pre-Nameprep Validation and Character<br>List Testing&quot;.<br>&gt;<br>&gt;&nbsp;&nbsp;&nbsp;&nbsp; I think.<br>&gt;<br>&gt;

<br>&gt;<br>&gt; Since you didn&#39;t comment on any of the other issues I wrote, does that<br>&gt; mean that you agree with them, or that you just hadn&#39;t gotten to them?<br>&gt; ;-)<br>It meant that I regarded them as irrelevant until we get this one

<br>settled, I think.<br><br></blockquote></div><br>