<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">


<HTML><HEAD>


<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">


<META content="MSHTML 6.00.6000.16674" name=GENERATOR>


<STYLE></STYLE>


</HEAD>


<BODY 


style="WORD-WRAP: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space" 


bgColor=#ffffff>


<DIV><FONT face=&#23435;&#20307; size=2></FONT>&nbsp;</DIV>


<DIV><FONT face=&#23435;&#20307; size=2></FONT>&nbsp;</DIV>


<DIV><FONT face=&#23435;&#20307;><FONT size=2>If <FONT face=Arial>SWORD's verbal search 
algorithms (or any other algorithms) can be used to build a database of 
similarity word sets, that seems fine. </FONT></FONT></FONT></DIV>


<DIV><FONT face=Arial size=2>What I mean is this:</FONT></DIV>


<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>


<DIV><FONT face=Arial size=2>Every possible domain label word or TLD word 
can be classified into a set of similar words, and in the end we can 
build a database containing all possible similarity word sets.</FONT></DIV>


<DIV><FONT face=Arial size=2>&nbsp;</FONT></DIV>


<DIV><FONT face=Arial size=2>So there will be many similarity word 
sets.</FONT></DIV>
<DIV><FONT face=&#23435;&#20307; size=2>For example:</FONT></DIV>


<DIV><FONT face=Arial size=2>similarity word A set (every word in 
this set is similar to word A)</FONT></DIV>
<DIV><FONT face=Arial size=2>similarity word B set (every word in this set 
is similar to word B)</FONT></DIV>
<DIV><FONT face=Arial size=2>similarity word C set (every word in this set 
is similar to word C)</FONT></DIV>
<DIV><FONT face=Arial size=2>similarity word D set (every word in this set 
is similar to word D)</FONT></DIV>


<DIV><FONT face=Arial size=2>...</FONT></DIV>


<DIV><FONT face=Arial size=2>...</FONT></DIV>


<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>


<DIV><FONT face=Arial size=2>When SWORD's verbal search algorithms encounter 
a new word X, they can decide whether X belongs in one of the existing 
similarity word sets. If it does, we add X to that set; if not, we create 
a new similarity word set for it.</FONT></DIV>


<DIV><FONT face=Arial size=2>If this process is repeated, the individual 
similarity word sets will grow, and so will the database that contains 
all of them.</FONT></DIV>
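<DIV><FONT face=Arial size=2>A minimal sketch of this process in Python. 
SWORD's algorithm is not described in this thread, so the sketch assumes a 
plain string-similarity ratio from the standard library as a stand-in; the 
names add_word and THRESHOLD, the cutoff value, and the sample labels are 
all hypothetical.</FONT></DIV>

```python
from difflib import SequenceMatcher

# Assumption: a plain string-similarity ratio stands in for SWORD's
# verbal search algorithm, whose details are not given in this thread.
THRESHOLD = 0.8  # hypothetical cutoff, chosen only for illustration


def similar(a, b):
    """Return True when the two words score at or above THRESHOLD."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= THRESHOLD


def add_word(database, word):
    """Add word to the first set whose reference word it resembles,
    or start a new set keyed by the word itself."""
    for reference, members in database.items():
        if similar(reference, word):
            members.add(word)
            return reference
    # No existing set matched, so the word founds a new set.
    database[word] = {word}
    return word


# The database maps each reference word to its similarity word set.
database = {}
for label in ["example", "examp1e", "exemple", "network", "netw0rk"]:
    add_word(database, label)

for reference, members in database.items():
    print(reference, sorted(members))
```

<DIV><FONT face=Arial size=2>Each new word is compared only with the 
reference word of each set, so the result depends on the order in which 
words arrive. A real system would probably need a more careful rule for 
words that are close to more than one set.</FONT></DIV>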


<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>


<DIV><FONT face=Arial size=2>This database may help us decide whether new 
gTLD strings would cause user confusion with existing TLDs. It could also help 
registries, registrants, and registrars when registering IDNs.</FONT></DIV>
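<DIV><FONT face=Arial size=2>As a rough illustration of such a check, the 
sketch below looks up a candidate string against a small list of existing 
TLDs. It assumes the standard library's difflib matching as a stand-in for 
whatever algorithm is finally chosen; the function name confusable_with, the 
cutoff, and the short TLD list are hypothetical.</FONT></DIV>

```python
from difflib import get_close_matches

# Hypothetical sample of existing TLDs; a real check would load the full list.
EXISTING_TLDS = ["com", "net", "org", "info", "biz"]


def confusable_with(candidate, cutoff=0.6):
    """Return the existing TLDs whose similarity to candidate meets the cutoff."""
    return get_close_matches(candidate.lower(), EXISTING_TLDS, n=5, cutoff=cutoff)


print(confusable_with("c0m"))
print(confusable_with("museum"))
```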


<DIV><FONT face=Arial size=2>Of course, that kind of database is not easy to 
build.</FONT></DIV>


<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>


<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>


<DIV><FONT face=Arial size=2>YAO Jiankang</FONT></DIV>


<DIV><FONT face=Arial size=2>CNNIC</FONT></DIV>


<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>




<BLOCKQUOTE 


style="PADDING-RIGHT: 0px; PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000000 2px solid; MARGIN-RIGHT: 0px">


  <DIV style="FONT: 9pt &#23435;&#20307;">----- Original Message ----- </DIV>


  <DIV style="BACKGROUND: #e4e4e4; FONT: 9pt &#23435;&#20307;; font-color: black"><B>From:</B> 


  <A title=vint@google.com href="mailto:vint@google.com">Vint Cerf</A> </DIV>


  <DIV style="FONT: 9pt &#23435;&#20307;"><B>To:</B> <A title=idna-update@alvestrand.no 


  href="mailto:idna-update@alvestrand.no">idna-update@alvestrand.no</A> </DIV>


  <DIV style="FONT: 9pt &#23435;&#20307;"><B>Sent:</B> Saturday, August 09, 2008 8:02 PM</DIV>


  <DIV style="FONT: 9pt &#23435;&#20307;"><B>Subject:</B> an interesting ICANN development on 


  similar domain names</DIV>


  <DIV><BR></DIV>


  <P style="MARGIN: 0px 0px 10px"><FONT style="FONT: 10px Arial" face=Arial 


  size=2><B>String Similarity Algorithm Update</B> -- ICANN staff recently 


  completed a workshop with SWORD, the partner who is assisting ICANN with the 


  creation of an algorithm that will help automate the process for assessing 


  similarity among proposed and existing TLD strings. SWORD's verbal search 


  algorithms are used by various patent and trademark offices throughout the 


  world. SWORD has completed a beta algorithm and reviewed several test cases 


  with ICANN staff. This is being done in order to refine the parameters and 


  discuss how the algorithm could be successfully integrated as a tool to help 


  implement the GNSO's recommendation that new gTLD strings should not result in 


  user confusion with existing TLDs.</FONT></P>


  <P>


  <HR>


  <P></P>_______________________________________________<BR>Idna-update mailing 


  list<BR>Idna-update@alvestrand.no<BR>http://www.alvestrand.no/mailman/listinfo/idna-update<BR></BLOCKQUOTE></BODY></HTML>