Here&#39;s what I think are the main issues.<br>

<br>

(live document at <a id="publishedDocumentUrl" class="tabcontent" target="_blank" href="http://docs.google.com/Doc?id=dfqr8rd5_50hdnzwmdh">http://docs.google.com/Doc?id=dfqr8rd5_50hdnzwmdh</a>)<br>

<br>

<ol style=""><li>Settle on character repertoire. Basic formulation is ok, but...<br></li><ol><li>Add extensions for stability</li><li>Don&#39;t be Eurocentric: ensure that modern scripts are in ALWAYS (or whatever it is called)

</li><li>Resolve ALWAYS/MAYBE/NEVER problem (see below).<br></li></ol></ol><br style="">There

are other smaller issues, wording of the text, continuing to make

progress on BIDI, and so on, but I think the above are the chief

remaining issues to get consensus on.<br><br>&nbsp;In particular, it sounds like

people would not be averse to having a separate preprocessing

document, which we think is required, so as long as we do that I&#39;m not

including it here. I have a draft at <a style="" title="Draft IDNA Preprocessing" id="publishedDocumentUrl" class="tabcontent" target="_blank" href="http://docs.google.com/Doc?id=dfqr8rd5_51c3nrskcx">Draft IDNA Preprocessing

</a> for discussion. Of course, that doesn&#39;t mean that we agree on the details for that yet!<br style="font-family: Arial;"><h3 style="font-family: Arial;">ALWAYS/MAYBE/NEVER problem</h3>If we take the very

strong approach that Patrik has currently, where

only Latin, Cyrillic, and Greek are really guaranteed, then over 90

thousand characters are no longer guaranteed to be part of IDNA. I

use the phrase &quot;not guaranteed&quot; deliberately. At Google, for example, we

want to be able to look at a URL and say that it is either compliant with IDNAbis or

not. And we don&#39;t want its compliance to change from TRUE to FALSE according to browser,

or in the future. It&#39;s ok for it to change from FALSE to TRUE in the future, but not the reverse.<br>

<br>

The operational difference between MAYBE and ALWAYS is a problem. <br>

<br>

Let&#39;s take a look at document authoring. If HTML document

authors/generators use hyperlinks with IRIs that contain characters in

the MAYBE category (since registries are allowed to register them, even

if it&#39;s not recommended), then those links would break if any of those

characters became NEVERs (assuming that browsers obey the rule that

NEVERs must not be looked up). That makes perfectly conformant pages

suddenly become non-conformant.<br>

<br>

The structure right now in tables/protocol gives user-agents (including

browsers, but also search engines like Google&#39;s) a choice of the frying

pan or the fire:<br>

<br>

<ol><li>If we want stability, only accept ALWAYS. That&#39;s untenable, since we couldn&#39;t handle most of the world.</li><li>If we want to serve our customers, accept ALWAYS+MAYBE. That is unstable, since MAYBE could change to NEVER at any time.

</li></ol>
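<br>
To make the two choices concrete, here is a minimal sketch of a user-agent check under each policy. The category tables, the sample label, and the names <code>label_accepted</code>, <code>STRICT</code>, and <code>LENIENT</code> are all hypothetical and exist only for illustration; real assignments would come from the tables document, not from this sketch.<br>

```python
from enum import Enum


class Category(Enum):
    ALWAYS = "ALWAYS"
    MAYBE = "MAYBE"
    NEVER = "NEVER"


# Hypothetical category tables, for illustration only.
# TABLE_V2 models a later revision in which a MAYBE character became NEVER.
TABLE_V1 = {"a": Category.ALWAYS, "\u0915": Category.MAYBE}
TABLE_V2 = {"a": Category.ALWAYS, "\u0915": Category.NEVER}

STRICT = {Category.ALWAYS}                   # choice 1: only accept ALWAYS
LENIENT = {Category.ALWAYS, Category.MAYBE}  # choice 2: accept ALWAYS+MAYBE


def label_accepted(label, table, accepted):
    """Return True if every character of the label is in an accepted category.

    Characters missing from the table are treated as NEVER.
    """
    return all(table.get(ch, Category.NEVER) in accepted for ch in label)


label = "a\u0915"  # a Latin letter followed by a Devanagari letter

strict_v1 = label_accepted(label, TABLE_V1, STRICT)    # rejected from the start
lenient_v1 = label_accepted(label, TABLE_V1, LENIENT)  # accepted today
lenient_v2 = label_accepted(label, TABLE_V2, LENIENT)  # rejected after the change
```

Under the strict policy the label is never usable; under the lenient policy it is accepted with the first table and rejected with the second, which is exactly the TRUE-to-FALSE change that implementers cannot tolerate.<br>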

<br>

Programs always have to

be prepared for characters becoming valid that were not in previous

versions. With any new version of

Unicode, a company like Apple, Google, or Microsoft updates its

software to use that version, and characters

become acceptable that were not previously. Everyone who deals with

Unicode needs to be prepared for that. (This situation is a bit like

language/country codes, where new ones arrive on our doorstep -- but

mechanisms like BCP 47 ensure that the old ones never become invalid.)

The key requirement for

stability is that characters that were acceptable in IDNAbis don&#39;t

suddenly become

invalid. If a character (or script) moves from (MAYBE + NEVER) into

ALWAYS in the future, it is not a problem for implementers. Moving from

(MAYBE + ALWAYS) to NEVER is a serious problem.<br>
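<br>
The stability requirement above can be expressed as a simple check between two revisions of the category table. This is only a sketch under stated assumptions: the function name <code>stability_violations</code>, the table layout (code point mapped to a category string), and the sample data are hypothetical, and a code point missing from a table is assumed to count as NEVER.<br>

```python
# Categories in which a character may be looked up.
PERMITTED = {"ALWAYS", "MAYBE"}


def stability_violations(old, new):
    """Return the sorted code points that were permitted in the old table
    but are NEVER (or missing, which this sketch treats as NEVER) in the new one.
    """
    return sorted(
        cp
        for cp, category in old.items()
        if category in PERMITTED and new.get(cp, "NEVER") == "NEVER"
    )


# Hypothetical sample tables, for illustration only.
old = {0x0061: "ALWAYS", 0x0915: "MAYBE", 0x10000: "NEVER"}

# Harmless revision: characters only move toward being permitted.
new_ok = {0x0061: "ALWAYS", 0x0915: "ALWAYS", 0x10000: "MAYBE"}

# Harmful revision: a previously permitted character becomes NEVER.
new_bad = {0x0061: "ALWAYS", 0x0915: "NEVER", 0x10000: "NEVER"}

ok_result = stability_violations(old, new_ok)
bad_result = stability_violations(old, new_bad)
```

A revision is safe for implementers when the returned list is empty; any entry in it marks a character whose existing links and registrations would stop working.<br>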

<br>

My recommendation is

and has

been: permit all characters from all modern scripts. Those are easily

identified, and do not disadvantage any modern language group. It does

not

require an elaborate -- and probably unworkable -- process for getting

buy-in. It would be acceptable to have historic scripts in MAYBE, on

the off chance that there is a successful revival, because it doesn&#39;t

put us in the frying-pan-or-fire position above, since all modern

scripts would be allowed.<br>

<br>

This

protocol is the wrong place to be making fine-grained linguistic

determinations in any event. Restrictions can be imposed by registries

or other

parties, and user-agents where needed. Such restrictions are an

exceedingly small problem compared to handling the issues raised by

spoofs like &quot;<a href="http://paypal.com">paypal.com</a>&quot;.<br>

<br>

If this approach is argued against, it should be with concrete examples

that can be reviewed and assessed. And the bad cases have to be

sufficient in number to warrant the complexity of the ALWAYS, MAYBE,

NEVER process.<br>

<br style="font-family: Arial;"><br style="font-family: Arial;"><br style="font-family: Arial;"><br style="font-family: Arial;">

<br style="font-family: Arial;">