&gt; <a href="http://www.ietf.org/internet-drafts/draft-faltstrom-idnabis-tables-02.txt">http://www.ietf.org/internet-drafts/draft-faltstrom-idnabis-tables-02.txt</a><br><br>While the rule structure has improved in this version, there are a number of problems remaining. Rule H stands out as one of them.

<br><br><pre>&gt; 2.8.  Rule H - Stable scripts</pre><br><span class="gmail_quote">On 6/12/07, <b class="gmail_sendername">John C Klensin</b> &lt;<a href="mailto:klensin@jck.com">klensin@jck.com</a>&gt; wrote:</span><br><div style="margin-left: 40px;">

I intensely dislike having Rule H.&nbsp;&nbsp;I think that dislike is<br>shared by Patrik, Harald, Cary, Tina and others.&nbsp;&nbsp;I also don&#39;t<br>think we have so far explained it, and the reasons for it, very<br>well, and I&#39;d appreciate the help of others in coming up with a

<br>better explanation.&nbsp;&nbsp;But we have concluded, sadly and painfully,<br>that it is necessary, at least for the short term.<br></div><br><div><span class="gmail_quote">On 6/12/07, <b class="gmail_sendername">Harald Alvestrand

</b> &lt;<a href="mailto:harald@alvestrand.no">harald@alvestrand.no</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>

- We (as in &quot;the community&quot;, not &quot;the editing team&quot;) have experienced<br>that a number of scripts have issues that are not resolved, or not<br>completely resolved, at this time.<br>- For some scripts, we&#39;re pretty certain there are no issues - or,

<br>rather, that the community&#39;s settled down to a specific set of tradeoffs<br>that are unlikely to change.<br></blockquote></div><br>Rule H has no justification in the document; not only that, as Ken points out, Latin, Greek, and Cyrillic are some of the *tougher* cases regarding security, not the easy cases. If one were to pick the scripts to start with in terms of reducing possible security problems, these would not actually be the ones to start with.

<br><br>I have said this before, but the way this process is being handled is not what I am used to in good engineering design. If you have a set of problems, and are proposing a number of steps to address those problems, you should be able to state, for each of those steps, an example of the problem that it solves and <span style="font-style: italic;">how </span>it solves that problem. We just don&#39;t see this in the document, nor on this list. The reasoning may have been worked out in private discussions, but I hadn&#39;t thought that was how the IETF was supposed to work...

<br><br>No reason is given for the focus on only

European scripts; and that focus will surely raise suspicions in many circles. While I&#39;m sure that the restriction to European languages is just because those are the ones the small group of authors is familiar with, it will not be received well. If &quot;we the community&quot; have &quot;experienced that a number of scripts have issues that are not resolved&quot;, then those problems should be enumerated *explicitly*, not hidden away.

<br><br>The situation might be different if we were starting from zero; but we are not. We already have an IDNA system that works for a great many people. And while there are security problems with it, those are well known and vendors are dealing with them. Moreover, the problems that IDNAbis solves are just the easy ones -- the harder ones are cases like the &quot;

<a href="http://paypay.com">paypay.com</a>&quot; case, which the current suggestion for IDNAbis doesn&#39;t touch. So it feels like we are looking at a proposal that <br><ol><li>doesn&#39;t actually help much with the practical problems that people face

</li><li>solves the easy problems, but not the hard ones; so people have to essentially do the work anyway</li><li>and removes much of the functionality, except for some favored groups: Europe and the Americas</li></ol>It feels a bit like some Federal agency&#39;s finding that there are some roads without side rails. It decides that because of that security problem, we need to forbid people from using any roads, except of course in New England -- because we know what the roads are like there.

<br><br><br><pre>&gt; 2.4.  Rule D - Ignorables<br><br>   property(cp) is in {Other_Default_Ignorable_Code_Point,<br>                       Noncharacter_Code_Point}</pre><br><span>Noncharacter_Code_Point is never in {Ll, Lu, Lo, Lm, Mn, Mc, Nd}, so this addition is unnecessary, just as it would be for the many other properties that are also definitionally never in the set (controls, etc.).

<br><br></span><pre>&gt; 3.  Calculation of the derived property</pre>The rules A-G look fine - if we go back to my message of 12/14/06, we see that they match what was there. (I added the correspondence to tables-02 rules in [..] below)

<br><br>0. Start with the empty set. For each code point cp from 0 to 0x10FFFF:<br>[A] 1. If generalCategory(cp) is in {Ll, Lu, Lo, Lm, Mn, Mc, Nd}, add cp<span class="q"><br>[B] 2. If NFKC(cp) != cp, remove cp<br>[C] 3. If casefold(cp) != cp, remove cp

<br></span>[D] 4. If defaultIgnorableCodePoint(cp), remove cp<br>[E] 5. If script(cp) in {Xsux, Ugar, Xpeo, Goth, Ital, Cprt, Linb, Phnx, <span id="st" name="st" class="st">Khar</span>, Phag, Glag, Shaw, Dsrt, Runr}, remove cp

<br>[F] 6. If block(cp) in {Combining_Diacritical_Marks<div id="mb_21">_for_Symbols, Musical_Symbols, Ancient_Greek_Musical_Notation}, remove cp<span class="q"><br>[G] N. If cp is in [-A-Z0-9], add cp<br></span></div><br>
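As an illustration only, the numbered steps above can be sketched in Python. The category, NFKC, and case-folding tests use the standard library; the default-ignorable, script, and block lookups are hypothetical hand-copied samples (the standard library has no such tables), so the names <code>SAMPLE_DEFAULT_IGNORABLE</code>, <code>SAMPLE_EXCLUDED_SCRIPTS</code>, <code>in_any</code>, and <code>is_allowed</code> are assumptions made for this sketch, and only two of the fourteen scripts are filled in.

```python
import unicodedata

# General categories admitted by step 1 (rule A).
LMN = {"Ll", "Lu", "Lo", "Lm", "Mn", "Mc", "Nd"}

# Step N (rule G): the grandfathered ASCII set [-A-Z0-9].
GRANDFATHERED = {ord(c) for c in "-ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"}

# ASSUMPTION: small hand-copied samples, only good enough to exercise each
# step. A real run would load the full property data from the UCD files.
SAMPLE_DEFAULT_IGNORABLE = {0x00AD, 0x034F, 0x115F, 0x1160, *range(0x200B, 0x2010)}
SAMPLE_EXCLUDED_SCRIPTS = {  # two of the fourteen scripts named in step 5
    "Runr": range(0x16A0, 0x16EB),
    "Goth": range(0x10330, 0x1034B),
}
EXCLUDED_BLOCKS = {  # the three blocks named in step 6
    "Combining_Diacritical_Marks_for_Symbols": range(0x20D0, 0x2100),
    "Musical_Symbols": range(0x1D100, 0x1D200),
    "Ancient_Greek_Musical_Notation": range(0x1D200, 0x1D250),
}


def in_any(cp: int, table: dict) -> bool:
    """True if cp falls inside any range of a (hypothetical) sample table."""
    return any(cp in r for r in table.values())


def is_allowed(cp: int) -> bool:
    """Apply steps 1-6 and N to one code point, in the order listed."""
    ch = chr(cp)
    ok = unicodedata.category(ch) in LMN            # [A] 1. add
    if unicodedata.normalize("NFKC", ch) != ch:     # [B] 2. remove
        ok = False
    if ch.casefold() != ch:                         # [C] 3. remove
        ok = False
    if cp in SAMPLE_DEFAULT_IGNORABLE:              # [D] 4. remove
        ok = False
    if in_any(cp, SAMPLE_EXCLUDED_SCRIPTS):         # [E] 5. remove
        ok = False
    if in_any(cp, EXCLUDED_BLOCKS):                 # [F] 6. remove
        ok = False
    if cp in GRANDFATHERED:                         # [G] N. add back
        ok = True
    return ok
```

Under these assumptions, <code>is_allowed(ord("a"))</code> is true, U+00C9 is removed at step 3, U+FB01 at step 2, and U+16A0 at step 5, while "A" is removed at step 3 and then restored by step N.<br><br>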

The numbers 1-6,N correspond to A-G in draft-faltstrom-idnabis-tables-02.txt (with the exception of the change in D as noted above).<br><br>While written functionally, this simply expresses the formation of a set through a sequence of boolean operations. It is simple because we start with one set, then remove items; at the very end we add back the grandfathered ASCII. Expanding the values to Always, Maybe, and Never basically had the effect of rewriting it as follows (notice also the reversal of A/1 and G/N to be more in line with what is in Patrik&#39;s document):

<br><br>0. Start with the empty set. For each code point cp from 0 to 0x10FFFF:<br><br>Grandfathered<br><div style="margin-left: 40px;"><span class="q">[G] N. If cp is in [-A-Z0-9], put cp in Always</span><br><span class="q">

</span></div><span class="q"><br>Functional Exclusions<br></span><div style="margin-left: 40px;"><span class="q">[B] 2. </span>Else if <span class="q">NFKC(cp) != cp</span>, put cp in Never and stop<br><span class="q">[C] 3. 

</span>Else if <span class="q">casefold(cp) != cp</span>, put cp in Never and stop<br>[D] 4. Else if defaultIgnorableCodePoint(cp), put cp in Never and stop<br><br></div>

Usage Exclusions<br><div style="margin-left: 40px;">

[E] 5. Else if script(cp) in {Xsux, Ugar, Xpeo, Goth, Ital, Cprt, Linb, Phnx, <span id="st" name="st" class="st">Khar</span>, Phag, Glag, Shaw, Dsrt, Runr}, put cp

in Maybe and stop<br>

[F] 6. Else if block(cp) in {Combining_Diacritical_Marks

_for_Symbols, Musical_Symbols, Ancient_Greek_Musical_Notation}, put cp

in Maybe and stop<br></div><div id="mb_21"><br>LMN Inclusion<br><div style="margin-left: 40px;">

[A] 1. Else if generalCategory(cp) is in {Ll, Lu, Lo, Lm, Mn, Mc, Nd}, put cp in Always and stop<br></div><br>Exclude everything else<br><div style="margin-left: 40px;">&nbsp;&nbsp;&nbsp;&nbsp; Z. Else put cp in Never<br><span class="q"></span>

<br></div>But section 3, &quot;Calculation of the derived property&quot;, is very hard to make out. Moreover, it is impossible to assess what it is supposed to be doing until the difference between Maybe Yes and Maybe No is completely spelled out operationally, and the goal is made clear.
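<br><br>As a rough illustration (a sketch, not anything taken from the draft), the Always/Maybe/Never cascade listed above could be written in Python as follows. The default-ignorable, script, and block tables are again hypothetical hand-copied samples, since the standard library does not provide those properties, and the names <code>Status</code> and <code>classify</code> are assumptions of this sketch.

```python
import unicodedata
from enum import Enum


class Status(Enum):
    ALWAYS = "Always"
    MAYBE = "Maybe"
    NEVER = "Never"


LMN = {"Ll", "Lu", "Lo", "Lm", "Mn", "Mc", "Nd"}
GRANDFATHERED = {ord(c) for c in "-ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"}

# ASSUMPTION: incomplete sample tables standing in for real UCD data.
SAMPLE_DEFAULT_IGNORABLE = {0x00AD, 0x034F, 0x115F, 0x1160, *range(0x200B, 0x2010)}
SAMPLE_EXCLUDED_SCRIPT_RANGES = [  # Runr and Goth only, out of fourteen
    range(0x16A0, 0x16EB),
    range(0x10330, 0x1034B),
]
EXCLUDED_BLOCK_RANGES = [
    range(0x20D0, 0x2100),    # Combining_Diacritical_Marks_for_Symbols
    range(0x1D100, 0x1D200),  # Musical_Symbols
    range(0x1D200, 0x1D250),  # Ancient_Greek_Musical_Notation
]


def classify(cp: int) -> Status:
    """First matching rule wins, in the order G, B, C, D, E, F, A, Z."""
    ch = chr(cp)
    if cp in GRANDFATHERED:                               # [G] N
        return Status.ALWAYS
    if unicodedata.normalize("NFKC", ch) != ch:           # [B] 2
        return Status.NEVER
    if ch.casefold() != ch:                               # [C] 3
        return Status.NEVER
    if cp in SAMPLE_DEFAULT_IGNORABLE:                    # [D] 4
        return Status.NEVER
    if any(cp in r for r in SAMPLE_EXCLUDED_SCRIPT_RANGES):  # [E] 5
        return Status.MAYBE
    if any(cp in r for r in EXCLUDED_BLOCK_RANGES):       # [F] 6
        return Status.MAYBE
    if unicodedata.category(ch) in LMN:                   # [A] 1
        return Status.ALWAYS
    return Status.NEVER                                   # Z
```

Because E and F are tested before A in this ordering, a code point in one of the listed blocks that is not a letter, mark, or digit (for example U+1D11E MUSICAL SYMBOL G CLEF) comes out as Maybe rather than Never in this sketch.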

<br><br>And what I can make out of that section looks unpleasant. Many characters are not subjected to conditions B-D, even though those conditions should put them into the Never category.<br><br></div>Mark<br>