No subject


Tue Nov 18 23:43:20 CET 2008


lookup unassigned characters. Put yourself in the shoes of a Web
search engine developer. You can update your crawler to support the
latest version of IDNA. However, you cannot get all of your users to
update their browsers very quickly. This is why Google emits URIs with
Punycode-encoded host names (MSIE6 does not perform IDNA).

Now, a document that matches the user's search query very well is
typically pushed up to the top of the first page of search results. If
that document's URI happens to use newer Unicode characters, the
user's browser may not be able to convert such Unicode labels to ASCII
labels, and so it would be great if the search engine would perform
the ASCII conversion.
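
As a rough illustration of that server-side conversion, here is a
minimal sketch in Python. The function name to_ascii_uri and the
example host are mine, not anything a real search engine uses. Note
that Python's built-in "idna" codec implements the original IDNA
(RFC 3490), so it rejects characters it does not know about; a real
crawler would plug in whatever IDNA version it supports. For brevity
the sketch drops any userinfo part of the URI.

```python
from urllib.parse import urlsplit, urlunsplit


def to_ascii_uri(uri: str) -> str:
    """Return the URI with its host converted to ASCII (Punycode) labels.

    Hypothetical helper: uses Python's built-in "idna" codec (RFC 3490),
    which raises UnicodeError for labels it cannot convert.
    """
    parts = urlsplit(uri)
    host = parts.hostname or ""
    # Each Unicode label becomes an "xn--" label; ASCII labels pass through.
    ascii_host = host.encode("idna").decode("ascii")
    netloc = ascii_host
    if parts.port is not None:
        netloc = f"{ascii_host}:{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(to_ascii_uri("http://bücher.example/katalog"))
    # prints http://xn--bcher-kva.example/katalog
```

A browser that performs no IDNA at all can still follow the resulting
URI, since it contains only ASCII labels.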

Of course, the old browser may not be able to display new Unicode
characters either. So it would be prudent for the search engine to
refrain from displaying the Unicode characters directly. Instead, it
might present a small link to a warning page that explains why the URI
hasn't been displayed. Likewise, a browser might refrain from
displaying Unicode characters that are newer than the ones it knows.
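
One way to make the "is this character too new to display?" decision
is to check each character against an older Unicode version. The
sketch below assumes Unicode 3.2 as the baseline (the version the
original IDNA is built on) and uses Python's unicodedata.ucd_3_2_0
database; the function name and the choice of baseline are my
assumptions, and a real deployment would pick the version its oldest
supported browsers understand.

```python
import unicodedata

# Character database frozen at Unicode 3.2, shipped with Python.
OLD_UCD = unicodedata.ucd_3_2_0


def has_only_old_characters(label: str) -> bool:
    """Return True if every character was already assigned in Unicode 3.2.

    Hypothetical helper: category "Cn" means "unassigned" in the
    baseline database, i.e. the character was added later.
    """
    return all(OLD_UCD.category(ch) != "Cn" for ch in label)


def display_form(unicode_label: str, ascii_label: str) -> str:
    """Pick the form to show: Unicode if safe, else the ASCII label."""
    if has_only_old_characters(unicode_label):
        return unicode_label
    return ascii_label
```

For example, U+1E9E (LATIN CAPITAL LETTER SHARP S) was added in
Unicode 5.1, so a label containing it would be shown in its ASCII
form, while a label such as "bücher" would be shown as is.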

Also, new Unicode characters could be abused by phishers who try to
collect passwords and the like from unsuspecting users. This is part
of the broader phishing problem, which can be attacked in a number of
different ways, including careful display, user education, and services
that warn users when a particular URI is known to point to a phishing
site.

Erik

