<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.6000.16674" name=GENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY
style="WORD-WRAP: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space"
bgColor=#ffffff>
<DIV><FONT face=宋体 size=2></FONT> </DIV>
<DIV><FONT face=宋体 size=2></FONT> </DIV>
<DIV><FONT face=宋体><FONT size=2>If <FONT face=Arial>SWORD's verbal search
algorithms (or any other algorithms) can be used to build a database of
similarity word sets, that seems fine. </FONT></FONT></FONT></DIV>
<DIV><FONT face=Arial size=2>What I mean is this:</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>Every possible domain label word or TLD word
can be classified into a set of similar words; eventually we can
build a database containing all possible similarity word sets.</FONT></DIV>
<DIV><FONT face=Arial size=2> </FONT></DIV>
<DIV><FONT face=Arial size=2>So there will be many similarity word
sets.</FONT></DIV>
<DIV><FONT face=宋体 size=2>For example:</FONT></DIV>
<DIV><FONT face=Arial size=2>similarity word A set (every word in this set
is similar to word A)</FONT></DIV>
<DIV><FONT face=Arial size=2>similarity word B set (every word in this set
is similar to word B)</FONT></DIV>
<DIV><FONT face=Arial size=2>similarity word C set (every word in this set
is similar to word C)</FONT></DIV>
<DIV><FONT face=Arial size=2>similarity word D set (every word in this set
is similar to word D)</FONT></DIV>
<DIV><FONT face=Arial size=2>...</FONT></DIV>
<DIV><FONT face=Arial size=2>...</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>When SWORD's verbal search algorithms encounter
a new word X, they can decide whether X belongs in one of the existing
similarity word sets. If so, we add X to that similarity word set; if not,
we create a new similarity word set for it.</FONT></DIV>
<DIV><FONT face=Arial size=2>If this process is repeated, the individual
similarity word sets will grow, and so will the database that contains all
of them.</FONT></DIV>
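<DIV><FONT face=Arial size=2>A rough sketch of this process in Python, only to
illustrate the idea. Everything in it is an assumption: SWORD's algorithm is
not described here, so difflib's SequenceMatcher stands in for the similarity
measure, and the 0.8 threshold, the class name SimilarityDatabase and its
methods are hypothetical. Each set is keyed by its first word (the "word A"
above); a new word joins the best-matching set or starts a new one.</FONT></DIV>
<PRE>
```python
from difflib import SequenceMatcher

# Assumed cutoff; the original message does not specify one.
SIMILARITY_THRESHOLD = 0.8


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score between 0 and 1 (not SWORD's algorithm)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


class SimilarityDatabase:
    """Hypothetical store of similarity word sets, keyed by representative word."""

    def __init__(self, threshold: float = SIMILARITY_THRESHOLD):
        self.threshold = threshold
        self.sets = {}  # representative word -> set of similar words

    def _best_match(self, word: str):
        """Return the representative most similar to word, or None."""
        scored = [(similarity(word, rep), rep) for rep in self.sets]
        best_score, best_rep = max(scored, default=(0.0, None))
        if best_rep is not None and best_score >= self.threshold:
            return best_rep
        return None

    def add(self, word: str) -> str:
        """Put word into an existing set if one fits, else create a new set.

        Returns the representative of the set the word ended up in.
        """
        rep = self._best_match(word)
        if rep is None:
            self.sets[word] = {word}
            return word
        self.sets[rep].add(word)
        return rep

    def lookup(self, word: str) -> set:
        """Return known words that word could be confused with (may be empty)."""
        rep = self._best_match(word)
        return set(self.sets[rep]) if rep is not None else set()


db = SimilarityDatabase()
for label in ("museum", "museums", "travel"):
    db.add(label)
```
</PRE>
<DIV><FONT face=Arial size=2>With these assumptions, "museums" joins the
"museum" set and "travel" starts its own set, and lookup() can then be used to
ask which known words a proposed string resembles. A real database would
probably need a similarity measure that understands visual and phonetic
confusion, not just shared characters.</FONT></DIV>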
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>This database may help us decide whether a new
gTLD string is likely to cause user confusion with existing TLDs. It could
also help registries, registrants, and registrars when registering IDNs.</FONT></DIV>
<DIV><FONT face=Arial size=2>Of course, that kind of database would not be
easy to build.</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>YAO Jiankang</FONT></DIV>
<DIV><FONT face=Arial size=2>CNNIC</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<BLOCKQUOTE
style="PADDING-RIGHT: 0px; PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000000 2px solid; MARGIN-RIGHT: 0px">
<DIV style="FONT: 9pt 宋体">----- Original Message ----- </DIV>
<DIV style="BACKGROUND: #e4e4e4; FONT: 9pt 宋体; font-color: black"><B>From:</B>
<A title=vint@google.com href="mailto:vint@google.com">Vint Cerf</A> </DIV>
<DIV style="FONT: 9pt 宋体"><B>To:</B> <A title=idna-update@alvestrand.no
href="mailto:idna-update@alvestrand.no">idna-update@alvestrand.no</A> </DIV>
<DIV style="FONT: 9pt 宋体"><B>Sent:</B> Saturday, August 09, 2008 8:02 PM</DIV>
<DIV style="FONT: 9pt 宋体"><B>Subject:</B> an interesting ICANN development on
similar domain names</DIV>
<DIV><BR></DIV>
<P style="MARGIN: 0px 0px 10px"><FONT style="FONT: 10px Arial" face=Arial
size=2><B>String Similarity Algorithm Update</B> -- ICANN staff recently
completed a workshop with SWORD, the partner who is assisting ICANN with the
creation of an algorithm that will help automate the process for assessing
similarity among proposed and existing TLD strings. SWORD's verbal search
algorithms are used by various patent and trademark offices throughout the
world. SWORD has completed a beta algorithm and reviewed several test cases
with ICANN staff. This is being done in order to refine the parameters and
discuss how the algorithm could be successfully integrated as a tool to help
implement the GNSO's recommendation that new gTLD strings should not result in
user confusion with existing TLDs.</FONT></P>
<P>
<HR>
<P></P>_______________________________________________<BR>Idna-update mailing
list<BR>Idna-update@alvestrand.no<BR>http://www.alvestrand.no/mailman/listinfo/idna-update<BR></BLOCKQUOTE></BODY></HTML>