<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META http-equiv=Content-Type content="text/html; charset=ks_c_5601-1987">

<META content="MSHTML 6.00.2900.2995" name=GENERATOR></HEAD>

<BODY>

<DIV dir=ltr align=left><SPAN class=914465017-27112006><FONT face=Arial 

color=#0000ff size=2>mark,</FONT></SPAN></DIV>

<DIV dir=ltr align=left><SPAN class=914465017-27112006><FONT face=Arial 

color=#0000ff size=2></FONT></SPAN>&nbsp;</DIV>

<DIV dir=ltr align=left><SPAN class=914465017-27112006><FONT face=Arial 

color=#0000ff size=2>Taking this from the other direction, one might start with 
a fairly limited set (or sets) of characters, though far more than the present 
LDH repertoire, that are believed to be "safe", and then try to find ways to 
expand the set(s) within an acceptable tolerance of safety risk. Plainly there 
will be differences of opinion as to what is "safe enough". In my opinion, the 
characters permitted in IDNs should not be required to offer the same degree of 
expressiveness as one would expect in natural written languages. These are, 
after all, computer-based identifiers, technically speaking. Plainly we want 
them to have some linguistic value in the sense that they are memorable, but the 
presence of search, cut/paste, and directories suggests that perfect 
memorability is less critical than, say, global interoperability. 

</FONT></SPAN></DIV>
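<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2>To make the 
inclusion-based idea above concrete, here is a minimal Python sketch. It is 
only an illustration under stated assumptions: the names SAFE_CHARS, 
expand_safe_set, and label_is_permitted are hypothetical, the starting set is 
simply lowercase LDH, and lowercasing plus NFC stands in for whatever mapping a 
real protocol would specify. Which characters count as "safe enough" remains a 
policy decision that the code does not answer.</FONT></DIV>
<PRE>
```python
import unicodedata

# Hypothetical starting repertoire: the existing LDH characters
# (letters, digits, hyphen). Everything else is rejected by default.
SAFE_CHARS = set("abcdefghijklmnopqrstuvwxyz0123456789-")


def expand_safe_set(safe, candidates):
    """Return a new set with the candidate characters added.

    Deciding which candidates are "safe enough" is a policy question;
    this helper only records the outcome of that decision.
    """
    return set(safe) | set(candidates)


def label_is_permitted(label, safe):
    """True if every character of the lowercased, NFC-normalized label is in the set."""
    normalized = unicodedata.normalize("NFC", label.lower())
    return all(ch in safe for ch in normalized)


if __name__ == "__main__":
    # U+00E9 is LATIN SMALL LETTER E WITH ACUTE, written as an escape
    # so the example does not depend on the page encoding.
    wider = expand_safe_set(SAFE_CHARS, "\u00e9")
    print(label_is_permitted("caf\u00e9", SAFE_CHARS))
    print(label_is_permitted("caf\u00e9", wider))
```
</PRE>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2>The point of the 
design is that a label is rejected unless every character has been explicitly 
admitted, so expanding the repertoire is a deliberate step.</FONT></DIV>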

<DIV dir=ltr align=left><SPAN class=914465017-27112006><FONT face=Arial 

color=#0000ff size=2></FONT></SPAN>&nbsp;</DIV>

<DIV dir=ltr align=left><SPAN class=914465017-27112006><FONT face=Arial 

color=#0000ff size=2>I hope no one reads this and thinks I am deliberately 
short-changing the expressiveness side of the equation, but I am deeply 
concerned that we appreciate how the intended utility of IDNs differs from that 
of general multilingual discourse.</FONT></SPAN></DIV>

<DIV dir=ltr align=left><SPAN class=914465017-27112006><FONT face=Arial 

color=#0000ff size=2></FONT></SPAN>&nbsp;</DIV>

<DIV dir=ltr align=left><SPAN class=914465017-27112006><FONT face=Arial 

color=#0000ff size=2>vint</FONT></SPAN></DIV>

<DIV dir=ltr align=left><SPAN class=914465017-27112006><FONT face=Arial 

color=#0000ff size=2></FONT></SPAN>&nbsp;</DIV>

<DIV dir=ltr align=left><SPAN class=914465017-27112006></SPAN>&nbsp;</DIV>

<DIV>&nbsp;</DIV>

<DIV dir=ltr align=left>

<DIV dir=ltr align=left><FONT face=Arial size=2>Vinton G Cerf</FONT></DIV>

<DIV dir=ltr align=left><FONT face=Arial size=2>Chief Internet 

Evangelist</FONT></DIV>

<DIV dir=ltr align=left><FONT face=Arial size=2>Google</FONT></DIV>

<DIV dir=ltr align=left><FONT face=Arial size=2>Regus Suite 384</FONT></DIV>

<DIV dir=ltr align=left><FONT face=Arial size=2>13800 Coppermine 

Road</FONT></DIV>

<DIV dir=ltr align=left><FONT face=Arial size=2>Herndon, VA 20171</FONT></DIV>

<DIV dir=ltr align=left><FONT face=Arial size=2></FONT>&nbsp;</DIV>

<DIV dir=ltr align=left><FONT face=Arial size=2>+1 703 234-1823</FONT></DIV>

<DIV dir=ltr align=left><FONT face=Arial size=2>+1 703-234-5822 (f)</FONT></DIV>

<DIV dir=ltr align=left><FONT face=Arial size=2></FONT>&nbsp;</DIV>

<DIV dir=ltr align=left><FONT face=Arial size=2><A 

href="mailto:vint@google.com">vint@google.com</A></FONT></DIV>

<DIV dir=ltr align=left><FONT face=Arial size=2><A 

href="http://www.google.com/">www.google.com</A></FONT></DIV>

<DIV dir=ltr align=left>&nbsp;</DIV></DIV>

<DIV>&nbsp;</DIV><BR>

<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>

<HR tabIndex=-1>

<FONT face=Tahoma size=2><B>From:</B> idna-update-bounces@alvestrand.no 

[mailto:idna-update-bounces@alvestrand.no] <B>On Behalf Of </B>Mark 

Davis<BR><B>Sent:</B> Monday, November 27, 2006 12:19 PM<BR><B>To:</B> 

idna-update@alvestrand.no<BR><B>Subject:</B> IDNAbis Goals<BR></FONT><BR></DIV>

<DIV></DIV>In order to assess the advantages and disadvantages of any approach, 

we need to have a good idea of the goals and the weights attached to them. Here 

is an initial take on some of the issues so far discussed, divided into 

categories. <BR><BR>A. Loosen some restrictions on IDNA. The goal is to allow, 

<SPAN style="FONT-STYLE: italic">*where feasible*</SPAN>, the same kind of 

expressive capability in other languages that is now provided for in English. It 

should be recognized that not all reasonable words of every language will 

qualify: even in English the lack of spaces and other punctuation forces 

compromises: words like "can't" are disallowed. <BR><BR>Here is what I've heard 

so far:<BR>

<OL>

  <LI>Allow Unicode 5.0 characters

  <LI>Provide some mechanism for updating more quickly to successive Unicode 
  versions.<BR>

  <LI>Allow for combining marks at the end of bidi fields 

  <LI>Allow for ZWJ/ZWNJ in limited contexts (see a previous 

message).<BR></LI></OL>Except for #4, which probably most people haven't looked 

through yet, it appears that these are mostly uncontroversial.<BR><BR>B. Tighten 

some restrictions on IDNA. The purpose of this appears to be to reduce the 

opportunity for spoofing. Thus any proposed restriction should be assessed 
against that metric. That is: (a) Does the restriction reduce spoofing 
significantly? (b) Is there no other reasonable mechanism for doing so? 

<BR><BR>Here is what I've heard so far:<BR>

<OL>

  <LI>Remove (or discourage) symbols and (most) punctuation.

  <UL>

    <LI>This appears to be mostly uncontroversial. While the vast majority of 

    symbols and punctuation do not cause spoofing problems (I&hearts;NY.com is not a 

    problem, for example), there is not enough value to having them to be worth 

    the effort. </LI></UL>

  <LI>Remove (or discourage) non-spacing marks.

  <UL>

    <LI>This is quite controversial. These marks are needed by many languages; 

    excluding them is like removing vowels from English: "<A 
    href="http://microsoft.com">microsoft.com</A>" becoming "<A 
    href="http://mcrsft.cm">mcrsft.cm</A>".

    <LI>A very good case has to be made that (a) they cause problems, and (b) 
    those problems can't feasibly be handled with other mechanisms. </LI></UL>

  <LI>Remove (or discourage) archaic / technical characters (characters not in 

  common modern use)<BR>

  <UL>

    <LI>Unicode supplies a proposed list of such characters, in <A 

    href="http://www.unicode.org/reports/tr39/#General_Security_Profile">http://www.unicode.org/reports/tr39/#General_Security_Profile</A>. 

    However, it is recognized that any such list will need refinement and 

    extension in the future.<BR>

    <LI>Certain scripts are quite clearly archaic, and could be easily removed 

    or discouraged. 

    <LI>Judging whether a character in a modern script is archaic, especially 

    those in broad usage such as Latin, Arabic, and Cyrillic, can be quite 

    difficult -- often these characters are pressed into use in minority 

    languages. <BR></LI></UL></LI></OL>A major issue is the choice between removal 

and discouragement. Removal has the very significant cost of breaking backwards 

compatibility, so a clear case has to be made that there is no feasible 

alternative to handle spoofing problems that would otherwise occur.<BR><BR>Mark 
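<BR><BR>As an illustration of how proposals B.1 and B.2 above might be checked 
mechanically, here is a minimal Python sketch. It rests on assumptions that are 
not part of any agreed specification: "symbols and punctuation" is read as the 
Unicode general categories S* and P*, "non-spacing marks" as category Mn, and 
the hyphen is exempted because LDH already permits it. The function name 
restricted_characters is hypothetical.
<PRE>
```python
import unicodedata


def restricted_characters(label):
    """List (character, general category) pairs hit by proposals B.1 or B.2.

    Assumptions: symbols and punctuation are categories S* and P*,
    non-spacing marks are category Mn, and the hyphen stays allowed.
    """
    hits = []
    for ch in label:
        if ch == "-":
            continue  # already permitted by LDH
        category = unicodedata.category(ch)
        if category[0] in ("S", "P") or category == "Mn":
            hits.append((ch, category))
    return hits


if __name__ == "__main__":
    # U+2665 BLACK HEART SUIT is a symbol (So).
    print(restricted_characters("i\u2665ny"))
    # The Hindi word for "Hindi" contains U+094D DEVANAGARI SIGN VIRAMA,
    # a non-spacing mark (Mn) that the word cannot be written without.
    print(restricted_characters("\u0939\u093f\u0928\u094d\u0926\u0940"))
```
</PRE>
The second call shows why B.2 is controversial: a blanket rule on category Mn 
flags a character that an ordinary word in a modern language requires.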

</BODY></HTML>