Mapping?

Lisa Dusseault lisa.dusseault at gmail.com
Wed Dec 2 16:06:39 CET 2009


I'd like to try to unpack some of the different use cases we're
talking about a little more.

ISTM that the use cases where the person following the link is the
same person who typed it in are the ones where locale-dependent mapping
might be most useful.  If I'm in a locale where Ȱ (U+0230) is considered
to be the capitalized version of o (ASCII o), it might very well be most
helpful to make that mapping.  Use cases where the same user types in
the domain name and then looks it up include:
 - typing links in the address bar
 - typing a mail address in the To field of an email
 - writing a Web page, blog post or email, wherein I check that the
links work before posting/sending my document

In contrast, when the person looking up the domain FȰȰ.example is not
the person who typed it in, in most cases we no longer know the intent
or locale of the person who typed it.  It may be the same locale as
that of the person who is looking up the domain, but it may not be.
The person who typed in the domain may have intended fȱȱ.example or
foo.example, and may have tested that before sending/posting the link,
but we no longer have that information.  Use cases include:
 - following an HTTP link in any Web page, document, blog post, email, etc.
 - using a mailto link (explicit or implicit), e.g. when one person
sends me another person's email address

We would probably all agree that people follow links while Web
browsing far more often than they type them in, and even when typing
in, auto-complete probably drastically reduces the number of
from-scratch mappings and lookups.

However, we probably have quite different assumptions about how much
Internet activity takes place among users of a consistent locale.  Can
we assume that Patrik wants ß interpreted as ß because he communicates
mostly in Swedish with Swedish users and mostly reads Swedish Web
pages?  Or must we assume that Patrik also gets email from German and
Swiss senders, and also reads Web pages (perhaps in English!) written
by German users who expected different mappings?  I am sure this
depends heavily on our model of a user, and whether we're using
ourselves as hypothetical examples or not.
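
To make the ß question concrete, here is a minimal sketch in Python.
It is only an illustration: Python's str.casefold() stands in here for
a mapping that turns ß into "ss", and str.lower() for one that leaves ß
alone.  Neither is meant to be the actual IDNA mapping, and the domain
name is made up.

```python
# Made-up example name containing U+00DF (sharp s).
name = "stra\u00dfe.example"  # straße.example

# Default Unicode lowercasing leaves the sharp s untouched.
kept = name.lower()

# Full case folding replaces the sharp s with "ss".
folded = name.casefold()

print(kept)    # the name with ß preserved
print(folded)  # the name with ß folded to "ss"
```

Whichever of the two a client applies decides which name gets looked
up, so two users following the same link can end up at different
labels.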

One slightly more solid question for browsers is, would it be entirely
crazy to have different mapping algorithms for typed-in domain names
than for links followed?  There might be a locale-dependent mapping as
well as a global mapping.  (I assume that having every established
locale mapping installed would be complete craziness.)
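
A rough sketch of what "two mapping algorithms" could look like, in
Python.  Everything here is hypothetical: the Origin type, the
"xx-example" locale tag, the table that treats U+0230 as capital "o",
and the choice of plain Unicode lowercasing as the global mapping are
assumptions made for illustration, not anything a browser is known to
do.

```python
from enum import Enum


class Origin(Enum):
    TYPED = "typed"        # the user typed the name themselves
    FOLLOWED = "followed"  # the name came from a link someone else authored


# Hypothetical locale table: a locale in which U+0230 is treated as
# the capital form of ASCII "o".
LOCALE_TABLES = {"xx-example": {"\u0230": "o"}}


def global_map(domain):
    """Locale-independent mapping (assumed: Unicode default lowercasing)."""
    return domain.lower()


def map_domain(domain, origin, locale=None):
    """Apply the locale table only to typed names, then map globally."""
    table = LOCALE_TABLES.get(locale, {}) if origin is Origin.TYPED else {}
    overridden = "".join(table.get(ch, ch) for ch in domain)
    return global_map(overridden)


name = "F\u0230\u0230.example"  # FȰȰ.example
typed = map_domain(name, Origin.TYPED, "xx-example")
followed = map_domain(name, Origin.FOLLOWED, "xx-example")
print(typed)     # locale mapping applied
print(followed)  # global mapping only
```

With this split the typed-in name maps to foo.example while the
followed link maps to fȱȱ.example, which is exactly the divergence the
question is about.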

Another question is: when posted links are followed, how often do we
know the locale where the link was authored?  Not that the browser
following the link would necessarily be able to apply the mappings of
the locale in which it was authored, but would it be slightly better
to apply a global mapping than a mapping from a different locale?

Does any authoring software fix up links as the user types?
When I type a link in a document, the authoring software often makes
that link active.  Is there any software that automatically lower-cases
links?  If so, would such software also be likely to map to PVALID
characters before the doc is finished?

Lisa

On Tue, Dec 1, 2009 at 12:45 PM, Shawn Steele
<Shawn.Steele at microsoft.com> wrote:
>
> One example I discussed with Patrik yesterday was whether locale
> might affect mapping. I'd like to get better insight into the general
> understanding of that.
>
>> 1. Could locale determine whether a PVALID character should be mapped
>> into another PVALID character prior to following the rules to turn
>> into an ALABEL?  I believe the consensus answer is probably SHOULD NOT
>> or MUST NOT because that would make domains with that valid character
>> unreachable by software using those locale rules.
>
> I agree.
>
>> 2. Could locale determine whether, or how, a DISALLOWED character is
>> mapped into a PVALID character prior to getting an ALABEL?
>
> No, for several reasons:
>
> A) If I email you a link that contains a DISALLOWED character, your machine/environment MUST map it to the same thing my machine did.  Otherwise I say "you have funny charges from travelling, visit Bank.org to correct it."  You are trying to pay for your flight home so you type "Bank.org" into the computer in the kiosk in the foreign airport, and if it uses different mapping rules you could end up at a phishing site.  You don't want VISA.com to go to a vısa.com just because you're using a Turkish airport browser.
>
> B) If I travel myself, I need consistent behavior regardless of the machine I'm using.
>
> C) If I see an international advertisement, the domains need to go to the same server, regardless of who is typing in the link, and how and where they are typing it.
>
> D) A server or relay wouldn't necessarily know the context the user expected when interpreting a forwarded request.
>
> E) It'd be a support nightmare.
>
> F) I'm not sure if it is practical to create APIs that enable this distinction.  (We (software community, not just my company) already have problems selecting the correct locale specific behavior for sorting and formatting, etc., so we'd be bound to get it wrong at least some of the time.)
>
> -Shawn
>
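
The VISA.com case in point A above can be sketched in a few lines of
Python.  Python's str.lower() is locale-independent, so the Turkish
rules (I to dotless ı, İ to i) are written out by hand in a small
table; the function names are invented for this sketch.

```python
# Turkish lowercasing rules: "I" -> dotless "ı" (U+0131),
# "İ" (U+0130) -> "i".  Hand-written for illustration.
TURKISH_LOWER = {"I": "\u0131", "\u0130": "i"}


def lower_default(domain):
    """Locale-independent lowercasing."""
    return domain.lower()


def lower_turkish(domain):
    """Lowercasing with the Turkish I/ı rules applied first."""
    return "".join(TURKISH_LOWER.get(ch, ch) for ch in domain).lower()


link = "VISA.com"
default_result = lower_default(link)
turkish_result = lower_turkish(link)
print(default_result)
print(turkish_result)
print(default_result == turkish_result)
```

The two results differ in a single character that is easy to miss on
screen, so the same typed string would be looked up as two different
domains depending on which machine did the mapping.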


More information about the Idna-update mailing list