Here&#39;s basically what I said:<br><br><div style="margin-left: 40px;">There are many, many cases of visual confusables - IPA is neither the only case nor the worst. Moreover, many IPA characters <i>are</i> used in legitimate alphabets, especially in non-European languages.<br>

<br>For example, there is a draft character picker on my home site, <a href="http://www.macchiato.com/" target="_blank">http://www.macchiato.com/</a>. Even among the common characters, you will see confusables, like <br><br>

<span style="font-family: tahoma,sans-serif;">ɓ<a href="http://logspot.com">logspot.com</a></span><br><br>where the <span style="font-family: tahoma,sans-serif;">ɓ</span> is <a href="http://unicode.org/cldr/utility/character.jsp?a=0253" target="_blank">http://unicode.org/cldr/utility/character.jsp?a=0253</a><br>

<br>(That is picking Latin from the left menu and Common from the center menu. At address-bar sizes, the ɓ can easily be mistaken for a plain b.)<br><br>And for that matter, if you go to Latin&gt;IPA, you&#39;ll see that

ASCII a-z are also IPA, as well as many other characters from

languages that you&#39;d recognize.<br><br>The working group also rejected sifting for historic characters, but if you go to those you&#39;ll find others, like <a href="http://unicode.org/cldr/utility/character.jsp?a=0185" target="_blank">http://unicode.org/cldr/utility/character.jsp?a=0185</a><br>
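<br>To see what is actually inside a label like the ones above, here is a minimal sketch that lists each character with its code point and Unicode name. It uses only Python&#39;s standard <code>unicodedata</code> module; the helper name <code>describe</code> is my own for illustration, not part of any library.<br>

```python
import unicodedata

def describe(label):
    """Return (character, code point, Unicode name) for each character in label.

    'describe' is a hypothetical helper name used only for this illustration.
    """
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in label
    ]

# The first letter here is U+0253, not ASCII "b" (U+0062).
for row in describe("\u0253logspot"):
    print(row)

# The historic character from the second link, U+0185.
print(describe("\u0185"))
```

Listing the names makes the difference obvious to a person reading the output, but it does not by itself say whether a label is legitimate; that still depends on context.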

<br>The problem simply cannot be solved in the protocol - there are too

many cases where legitimate and illegitimate labels can&#39;t be

distinguished, not without context. And even trying to distinguish them

would take years. Note that the use of NFKC+CaseFolding dramatically

reduces the

opportunities for confusion - without those, we&#39;d be much worse off. And yet two edge

cases resulting from those (eszett &amp; sigma) have absorbed a huge

amount of time. <i>And that is just for Latin -- there are far trickier issues in many other scripts, or if multiple scripts are allowed</i>.<br><br>The issue of visual confusion is much, much bigger than can be

handled in the protocol - it really takes involvement by the user

agents (browsers, etc.) and registries, because they have far more

information available in terms of context and environment.<br><br>That&#39;s why we have put together guidance in:<br><br><a href="http://www.unicode.org/reports/tr36/" target="_blank">http://www.unicode.org/reports/tr36/</a><br>

<br>and data in:<br><br><a href="http://www.unicode.org/reports/tr39/" target="_blank">http://www.unicode.org/reports/tr39/</a><br></div>
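<br>As a rough illustration of the NFKC+CaseFolding point above, the sketch below approximates that mapping with Python&#39;s standard library. It is an assumption-laden stand-in, not the actual IDNA mapping: the function name <code>fold</code> is hypothetical, and it simply applies NFKC and then <code>str.casefold()</code>.<br>

```python
import unicodedata

def fold(label):
    """Approximate NFKC + case folding with the standard library.

    Hypothetical helper for illustration; this is not the full IDNA mapping.
    """
    return unicodedata.normalize("NFKC", label).casefold()

# Compatibility variants collapse: fullwidth "ABC" becomes plain "abc".
print(fold("\uff21\uff22\uff23"))

# The two edge cases: eszett folds to "ss", final sigma folds to sigma.
print(fold("Stra\u00dfe"))
print(fold("\u03c2") == "\u03c3")

# A visual lookalike survives: U+0253 is a distinct letter, so the
# folded label is still different from "blogspot".
print(fold("\u0253logspot") == "blogspot")
```

Folding removes a large class of duplicates, but characters such as U+0253 are separate letters rather than variants of b, so no normalization step will merge them. Catching those needs the kind of confusables data and contextual checks described in the two reports.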


<br>Mark<br>