What rules have been used for the current list of codepoints?

Erik van der Poel erikv at google.com
Fri Dec 15 19:35:55 CET 2006


On 12/15/06, Gervase Markham <gerv at mozilla.org> wrote:
> In many applications of IDNA, the display technology will not know the
> language(s) the user understands. The side of a bus is one obvious
> example

The side of a bus is not a security issue. It isn't a serious
interoperability issue either.

> but even various computer applications may not know. Today's
> email clients don't generally know.

Most email clients have a user interface in a single language. That
would be the default language to use for IDN display. Some clients may
even allow the user to specify multiple languages, as browsers do.

> It is also true that getting users to understand and take heed of even
> the simplest security-related UI is a hard battle, and therefore what
> they must pay attention to must be the absolute minimum necessary.

I agree. A shaded rectangle is quite minimal.

> Therefore, an IDNA system which relies for its safety on the client
> alerting the user to "unfamiliar characters", and the user noticing the
> alert and taking action, is dangerous.

True. That's why we have blacklists of dangerous URLs like:

http://www.paypal-com.secure-login.mx23.cc/

And this is not even an IDN.

> (Such a system also discourages the uptake of IDN, because the owner of
> an IDN domain name cannot know what proportion of his customers will see
> a scary warning message when they try and visit his site. But that's a
> different point to the security one.)

A domain owner is free to register multiple domain names, one for each
language that his customers read. So the owner inserts the English
domain name in the English ad, the Japanese domain name in the
Japanese ad, and so on.

> The way to avoid spoofs is, instead, to cut down the set of characters
> as far as is reasonably possible and then to require registries to have
> policies which do not issue two confusable domain names to different
> entities. There are several technical and logistical ways of achieving
> this. I see no reason why a registry would object to having a policy
> which prevents some of its customers defrauding other of its customers.

Yes, that is another way to avoid spoofs. Maybe I shouldn't have said
"THE way to avoid spoofs" in my email. Sorry. It will be interesting
to see which ways the TLDs adopt in the long run (if at all).

> This is the policy that Firefox adopts; we currently display IDN domain
> names for around 30 TLDs, the registries for all of which have
> anti-spoofing policies. Any registry with such a policy is welcome to
> ask to be included in the list, and we will ship the list change in our
> next security update.

I noticed that Firefox does not display Japanese .com names for me
even though I have Japanese in my list of languages (since I can read
it). However, Microsoft IE7 does display them for me. Firefox is
still a relatively minor player. It will be interesting to see whether
the Firefox tail can wag the Verisign/Microsoft dog.
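
As a rough illustration of the TLD whitelist policy quoted above, here
is a minimal Python sketch. It is not Firefox's code. The whitelist
contents and the function name display_form are assumptions made up
for the example. It relies on Python's built-in "idna" codec, which
implements the IDNA 2003 ToASCII operation.

```python
# Hypothetical whitelist: TLDs whose registries are assumed to have
# anti-spoofing policies. The real list is maintained by the browser.
IDN_TLD_WHITELIST = {"jp", "de"}


def display_form(hostname, whitelist=IDN_TLD_WHITELIST):
    """Return the string a client would show for a host name.

    Names under a whitelisted TLD are shown in Unicode; all other
    names are shown in their ASCII ("xn--") form.
    """
    tld = hostname.rsplit(".", 1)[-1].lower()
    if tld in whitelist:
        return hostname
    # Python's built-in codec applies IDNA 2003 ToASCII label by label.
    return hostname.encode("idna").decode("ascii")


print(display_form("b\u00fccher.de"))
print(display_form("b\u00fccher.com"))
```

Under these assumptions the .de name is shown as typed, while the .com
name falls back to its ASCII form. A language-based policy such as the
one IE7 uses would replace the TLD test with a check against the
user's configured languages.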

> > Japanese domain names would not be shown to most American users, and
> > that's OK because they can't read them anyway!
>
> That's rather a naive and sweeping statement.

Calm down. I said "most", and it's true that most Americans cannot
read Japanese.

> Are there no Japanese
> Americans? Are there no Japanese visiting America and using Internet
> cafes? Are there no Americans learning Japanese? Must all of these
> people reconfigure every browser and other IDNA-aware client they use to
> tell it all the languages they can read? And, in the case of a client
> they are using temporarily, configure it back afterwards to avoid
> putting others at risk?

We are still at the very beginning of the adoption of IDNA, and it may
not ever truly catch on, but I suspect that Internet cafes will
eventually make it easier for users to change the language of the user
interface, including the browser's. Novice users may try to conduct
sensitive financial transactions in Internet cafes when the user
interface is in an unfamiliar language. We need to teach users to be
careful out there. It's an education process, as we have all said
several times.

> > And most Americans are not interested in doing business with a paypal
> > that has one of the letters in Cyrillic, so you just don't show them
> > the spoofed name. You show them a warning instead.
>
> This is the mixed-script question, not the unfamiliar character
> question. But, to use your example, if payp<cyrillic-a>l.com gets issued
> to someone other than the owner of paypal.com, then that is squarely the
> responsibility of the .com registrar, and they should be taken to task
> for it. It should not need to be the user's responsibility to avoid
> being taken in.

True. It will be interesting to see what happens with Cyrillic domain
names in the long run. There is a company called uralweb where the
"ural" is in Cyrillic and the "web" is in Latin. Maybe the .ru and
.com registries will allow certain script mixtures, as long as it is
quite clear which part is Cyrillic and which part is Latin (unlike the
paypal spoof).
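
To make the difference between the two cases concrete, here is a
minimal Python sketch that splits a label into runs of letters from
the same script. It is only a heuristic for illustration, not any
registry's actual policy. The helper names rough_script and
script_runs are made up, and the script is approximated by the first
word of the Unicode character name, since the standard library has no
direct lookup for the Script property.

```python
import unicodedata
from itertools import groupby


def rough_script(ch):
    # First word of the Unicode character name, e.g. "LATIN" or
    # "CYRILLIC". This is an approximation of the Script property.
    return unicodedata.name(ch, "UNKNOWN").split(" ")[0]


def script_runs(label):
    """Return the sequence of scripts for consecutive runs of letters."""
    letters = [ch for ch in label if ch.isalpha()]  # skip digits, hyphens
    return [script for script, _ in groupby(letters, key=rough_script)]


# "paypal" with a Cyrillic "a": a single foreign letter inside a Latin word.
print(script_runs("payp\u0430l"))
# "ural" in Cyrillic followed by "web" in Latin: two clean runs.
print(script_runs("\u0443\u0440\u0430\u043bweb"))
```

The spoof yields three runs (Latin, Cyrillic, Latin), while the
uralweb label yields two (Cyrillic, then Latin). A policy could treat
the second pattern as acceptable and the first as suspicious. The
heuristic is too coarse for Japanese, where kanji, hiragana and
katakana are routinely mixed and would be reported as separate runs.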

Interesting stuff. I'm going to wait and see...

Erik

