I think that is overly complicated. The consensus is to have a required mapping step for compatibility in Lookup (and forbid mapping in Registration). Moreover, we have the strong statement from DENIC that (a) they prefer mapping for compatibility, and (b) if there is a mapping, then they want the mapping of eszett to be in accordance with IDNA2003. And I believe (although it is not completely clear) that the Greeks feel the same about sigma.<br>

<br>Given that, the only change in the structure of the mapping of IDNA2003 is to allow ZWJ/ZWNJ. We can fit these pieces together much more simply, and without even lugging around an IDNA2003 implementation (which we&#39;d like to get rid of), or lugging around a Unicode 3.2 implementation. Moreover, two lookups are <i>only</i> required when a domain name contains at least one ZWJ or ZWNJ.<br>


<br><br>Here is a proposal on that basis:<br><br><a href="http://tools.ietf.org/html/draft-ietf-idnabis-protocol-11">http://tools.ietf.org/html/draft-ietf-idnabis-protocol-11</a><br><br>In 4.1, replace:<br><br><div style="text-align: left; margin-left: 40px;">

  Entities responsible for<br>   zone files (&quot;registries&quot;) are expected to accept only the exact<br>   string for which registration is requested, free of any mappings or<br>   local adjustments.  They SHOULD avoid any possible ambiguity by<br>

   accepting registrations only for A-labels, possibly paired with the<br>   relevant U-labels so that they can verify the correspondence.<br></div><br>by<br><div style="margin-left: 40px;">  Entities responsible for<br>   zone files (&quot;registries&quot;) MUST accept only U-labels or A-labels. <br>

  They SHOULD avoid any possible ambiguity by<br>   accepting registrations only for A-labels, preferably paired with the<br>   corresponding U-labels so that registrants can verify the identity of the labels.<br></div><br>

Replace Section 5.3 by <br><br><div style="margin-left: 40px;">5.3. Character Transformations<br><br>The Unicode string MUST be transformed according to the specifications in 5.3.1 and 5.3.2. These transformations are designed to allow for mapping compatibility with IDNA2003 without requiring Unicode 3.2 implementations. Even with these transformations, however, there are many characters that are allowed in IDNA2003 that are not allowed by IDNA2008. Implementations may use one of the techniques described in Appendix A to handle such domain names during a transitional period.<br>

<br>It is important to note that labels in application protocols, files, or links SHOULD be in U-label or A-label form.<br><br><br>5.3.1 Normalization and Casefolding<br><br>The Unicode string MUST be transformed by normalizing with Unicode normalization form KC, then case folding, then normalizing again. This guarantees that none of the resulting characters in the string are Unstable according to the criterion in Tables Section 2.2. Unstable (B). In pseudocode:<br>

<br>   string = toNFKC(toCaseFold(toNFKC(string)));<br><br>Example: &lt;A, U+0300 COMBINING GRAVE ACCENT&gt; is transformed into &lt;U+00E0 ( à ) LATIN SMALL LETTER A WITH GRAVE&gt;.<br><br><br>5.3.2 Removal of Ignorables<br>

<br>Certain Unicode characters are called Default_Ignorable_Code_Points. For more information, see Tables Section 2.3. IgnorableProperties (C). The Unicode string MUST be transformed by removing all Default_Ignorable_Code_Points characters except for the Join Controls specified in Tables Section 2.8. JoinControl (H). In pseudocode:<br>

<br>   string = removeAll(string, Default_Ignorable_Code_Point - Join_Control)<br><br>Example: &lt;A, U+00AD (  ) SOFT HYPHEN, B&gt; is transformed into &lt;A, B&gt;.<br></div><br>Replace Appendix A by<br><br><div style="margin-left: 40px;">

Appendix A. Transitional Techniques<br><br>Registries should support IDNA2008 as soon as possible, and no longer support registration of any labels that are only valid in IDNA2003. In Lookup, on the other hand, many implementations will need to provide backwards compatibility for IDNA2003 labels during some transitional period. These IDNA2003 labels will typically contain a symbol or punctuation mark that is not allowed under IDNA2008, such as &quot;I&lt;heart&gt;NY&quot;.<br>

<br>The following describes a technique for modifying the lookup process to deal with that situation. There are two cases to be handled, according to whether the labels in the domain name pass the tests of Section 5. Note that two lookups are <i>only</i> required when a domain name contains at least one ZWJ or ZWNJ.<br>

<br>Case 1. The labels pass the tests of Section 5 (typically all M-Labels)<br><ul><li>Perform the lookup with the corresponding XN-Labels </li><li>If it fails, and it contains any ZWJs or ZWNJs, remove them and perform the lookup with the result.</li>

<li>If that fails, stop with an error.</li></ul>Case 2. The labels don&#39;t all pass the tests of Section 5 (typically at least one non-M-Label)<br><ul><li>Transform each label according to Section 5.3. In addition, remove any  ZWJs or ZWNJs.</li>

<li>If the string contains any unassigned Unicode characters, stop with an error.</li><li>If the corresponding XN-Label contains any characters prohibited by IDNA2003 (<a href="http://www.ietf.org/rfc/rfc3454.txt">http://www.ietf.org/rfc/rfc3454.txt</a> Section C. Prohibition tables),

stop with an error.</li><li>Perform the lookup with that XN-Label.</li></ul>The conditions in Case 2 are slightly different than for IDNA2003, but avoid having to retain a complete IDNA2003 implementation: only a small table of prohibited characters needs to be retained. Alternatively, an IDNA2003 implementation can be used in a modified Case 2.<br>

</div><br>[Ed note: the reason I have the somewhat clumsy language &quot;typically...M-Labels&quot; is that we don&#39;t guarantee that what results from Section 5 (Lookup) is actually an M-Label, because Section 5.3 doesn&#39;t guarantee U-Labels.]<br>
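<br>As a rough illustration only, the two transformations proposed for Section 5.3 above could be sketched in Python as follows. This assumes that Python's str.casefold() is an acceptable stand-in for toCaseFold. The function names and the short table of ignorable code points are made up for the example, since the standard library does not expose the full Default_Ignorable_Code_Point property.<br>

```python
import unicodedata

# Join controls that the proposed 5.3.2 keeps: ZWNJ (U+200C) and ZWJ (U+200D).
JOIN_CONTROLS = {"\u200c", "\u200d"}

# ASSUMPTION: a small hand-picked subset of Default_Ignorable_Code_Point,
# for illustration only. A real implementation needs the full property.
SAMPLE_DEFAULT_IGNORABLES = {
    "\u00ad",  # SOFT HYPHEN
    "\u034f",  # COMBINING GRAPHEME JOINER
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u2060",  # WORD JOINER
    "\ufeff",  # ZERO WIDTH NO-BREAK SPACE
}


def to_nfkc(s):
    """Normalize to Unicode normalization form KC."""
    return unicodedata.normalize("NFKC", s)


def normalize_and_casefold(s):
    """5.3.1 sketch: NFKC, then case fold, then NFKC again."""
    return to_nfkc(to_nfkc(s).casefold())


def remove_ignorables(s):
    """5.3.2 sketch: drop ignorables, but keep the join controls."""
    drop = SAMPLE_DEFAULT_IGNORABLES - JOIN_CONTROLS
    return "".join(ch for ch in s if ch not in drop)


def transform(s):
    """Apply the 5.3.1 step followed by the 5.3.2 step."""
    return remove_ignorables(normalize_and_casefold(s))
```

With this sketch, transform("A\u0300") yields U+00E0, the eszett in "stra\u00dfe" folds to "ss" as in IDNA2003, and a ZWJ inside a label is left in place.<br>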

<br><br>
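The two cases in the proposed Appendix A could be wired together roughly like this. It is a sketch under stated assumptions, not a reference implementation: resolve, passes_section5, transform and idna2003_allows are hypothetical callables supplied by the caller (the last one standing in for both the unassigned-character check and the RFC 3454 prohibition tables), and returning None stands in for "stop with an error".<br>

```python
def strip_joiners(name):
    """Remove ZWNJ (U+200C) and ZWJ (U+200D) from a name."""
    return name.replace("\u200c", "").replace("\u200d", "")


def to_xn_name(name):
    """Encode each non-ASCII label with Punycode and the xn-- prefix."""
    out = []
    for label in name.split("."):
        if label.isascii():
            out.append(label)
        else:
            out.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(out)


def transitional_lookup(name, resolve, passes_section5, transform,
                        idna2003_allows):
    """Return the result of resolve(), or None when the lookup fails.

    All four callables are hypothetical hooks for this sketch:
      resolve(xn_name)       -> answer, or None if the name is not found
      passes_section5(name)  -> True if every label passes the Section 5 tests
      transform(label)       -> label mapped per the proposed Section 5.3
      idna2003_allows(name)  -> False for unassigned or prohibited characters
    """
    if passes_section5(name):
        # Case 1: look up as-is, then retry without join controls if any.
        result = resolve(to_xn_name(name))
        stripped = strip_joiners(name)
        if result is None and stripped != name:
            result = resolve(to_xn_name(stripped))
        return result

    # Case 2: map each label, drop join controls, then apply IDNA2003 checks.
    mapped = ".".join(strip_joiners(transform(label))
                      for label in name.split("."))
    if not idna2003_allows(mapped):
        return None
    return resolve(to_xn_name(mapped))
```

Note that the second lookup in Case 1 only happens when stripping actually changes the name, which matches the point that two lookups are needed only for names containing a ZWJ or ZWNJ.<br>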

Add to Defs:<br><div style="margin-left: 40px;">An M-Label is a Unicode string whose transformation according to Section 5.3 of Protocol results in a U-Label.<br></div><br>Mark<br><br><br><br>On Mon, Mar 30, 2009 at 04:41, Vint Cerf &lt;<a href="mailto:vint@google.com">vint@google.com</a>&gt; wrote:<br>

&gt; There has not been any significant objection to the proposals made<br>&gt; during the IETF 74 meeting to apply some form of mapping during<br>&gt; lookup. The two questions outstanding are:<br>&gt;<br>&gt; 1. what mapping function should be used?<br>

&gt; 2. how should it be used<br>&gt;<br>&gt; As Harald and others have observed, if it is applied before an<br>&gt; IDNA2008-style lookup, we will not find new characters permitted under<br>&gt; IDNA2008 if they happen to be mapped under IDNA2003. This seems to<br>

&gt; argue for:<br>&gt;<br>&gt; 1. first look up under IDNA2008 rules<br>&gt; 2. If a domain name is found, return the corresponding results<br>&gt; 3. If a domain name is not found, apply IDNA2003 mapping<br>&gt; 4. If a domain name is found, return the results<br>

&gt; 5. If a domain name is not found, report that no such domain name exists<br>&gt;<br>&gt; One final point. It seems to me that we should put the IDNA2003<br>&gt; mapping function into stasis, making no future changes to it, and use<br>

&gt; the IDNA2008 framework to accommodate any new additions into Unicode<br>&gt; versions as they are released. Assuming we have ample warning of a new<br>&gt; version, we can even prepare tables suited to the new release ahead of<br>

&gt; time so as to have them available at the point where a new version of<br>&gt; Unicode is adopted.<br>&gt;<br>&gt; Could the WG please analyze this proposition, point out flaws and<br>&gt; suggested corrections for them?<br>

&gt;<br>&gt; thanks<br>&gt;<br>&gt; vint<br>&gt;<br>&gt;<br>&gt; Vint Cerf<br>&gt; Google<br>&gt; 1818 Library Street, Suite 400<br>&gt; Reston, VA 20190<br>&gt; 202-370-5637<br>&gt; <a href="mailto:vint@google.com">vint@google.com</a><br>


&gt;<br><br>