Q1 is mapping on lookup permanent or transitional?

John C Klensin klensin at jck.com
Sat Apr 4 08:27:43 CEST 2009

--On Wednesday, April 01, 2009 11:26 -0700 Lisa Dusseault
<lisa.dusseault at gmail.com> wrote:

> I'm assuming by "mapping" we now mean that when the
> application has a string that is not valid as an IDNA2008
> label, it tries to help the user arrive at the most
> appropriate valid label by folding case and doing some
> normalization.  Under this definition:
>  - I believe this is permanent; applications never lose the
> need to help the user.

While I agree with the "never lose the need..." principle, in
this particular space, there is actually a thin line between
"help the user" and "persuade the user that the system has DWIM
capability, which will often fail, leaving said user in a state
of complete confusion or serious irritation".  While I do not
see any possibility of addressing the issues in the IDN context
-- even if only because a better solution for one particular
language would foul things up for others using the same script --
the issues that Jefsey and his colleagues have raised about
appropriate case matching for French are actually good examples
of this: if users have learned that most mappings behave in a
reasonable and predictable way, they will expect all
mapping/matching operations to work the way they would
predict... and be confused or irritated when they do not.

And that remains the argument for dealing with mappings as a
localization matter, independent of any issues rooted in
IDNA2003 compatibility and the obvious need to be sure that they
stay local so that they do not interfere with interoperability.
Those may be contradictory goals requiring tradeoffs, but they
nonetheless describe the behaviors that users will expect.

>  - I don't know if we can require it. Certain applications may
> only get valid input, e.g. applications that receive labels
> from the DNS in the first place, display them, and let them be
> chosen for further lookup.  Other applications may have no way
> to confirm a conversion, and prefer to simply reject bad input.

Indeed.  And part of the problem, too, is that, as soon as one
starts accepting some types of bad input (by mapping it or
otherwise), the question arises of how bad something has to be
before a given application will not or cannot make things work
anyway.  Criteria that might strike us as reasonable, including
making only those changes that we can make without any risk of
getting things wrong or reflecting general decisions made for
Unicode, may strike users who do not share our understanding or
perceptions as bizarre or putting the needs of computers ahead
of the needs of people.

>  - I don't know if we can require universally consistent
> conversion to valid labels.  Some regions may adopt variations
> for their own maximum usability experience.

Yes.  See above and the recent "LRI" comments.

>  - The design should be done with at least some independence of
> IDNA2003.  If we were going to IDNA2008 from scratch, what
> would the most appropriate help for users be?  After
> considering that independently, we can consider what tradeoffs
> to make to that conversion definition for best backward
> compatibility with IDNA2003.

I believe this is the right approach.

>  - Is there some way we can guide where this happens?  E.g.
> encourage protocol libraries to have a "mapToValid" method as
> well as a "lookupValid" method that only accepts valid input.
> That way, application implementors get the tools they need to
> see what's going on and choose whether to confirm a suggested
> conversion or fail.

I'm very reluctant to see us add anything to the task list of
this already-late WG, especially anything that might be
controversial, but there is precedent for BCP-style guidance to
implementers.  Or, at a less formal level, we could always add
some words to "Rationale" if we could agree on the words.
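
For concreteness, the "mapToValid" / "lookupValid" split suggested in
the quoted text might look something like the sketch below.  It is
only an illustration: the function names are adapted from Lisa's
note, the validity check is a deliberately crude stand-in rather than
the actual IDNA2008 rules, and the mapping (lowercasing plus NFC) is
an assumed choice, not something the WG has agreed on.

```python
import unicodedata
from typing import Optional


def is_valid_label(label: str) -> bool:
    """Crude stand-in for an IDNA2008 validity check (assumption).

    The real rules are far more involved; here a label counts as
    "valid" if it is non-empty, already in NFC, and already lowercase.
    """
    return (
        bool(label)
        and label == unicodedata.normalize("NFC", label)
        and label == label.lower()
    )


def map_to_valid(raw: str) -> Optional[str]:
    """Suggest a valid label for `raw`, or None if none can be suggested.

    Assumed mapping: lowercase, then normalize to NFC.  The caller
    decides whether to confirm the suggestion with the user or fail.
    """
    candidate = unicodedata.normalize("NFC", raw.lower())
    return candidate if is_valid_label(candidate) else None


def lookup_valid(label: str) -> str:
    """Accept only labels that are already valid; never map silently."""
    if not is_valid_label(label):
        raise ValueError(f"not a valid label: {label!r}")
    # A real library would perform the DNS lookup here; this sketch
    # simply returns the label that would be looked up.
    return label
```

An application that can query the user would call map_to_valid(),
show the suggestion, and pass it to lookup_valid() only after
confirmation; one that cannot would call lookup_valid() directly and
reject whatever it refuses.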

As others have pointed out, there are many situations in which
"query the user" is not an option.  I believe that presenting
multiple options, as is typical with a search engine, is in the
"query the user" category, but others disagree about that.  Even
with email, there is a long history that might persuade us that
it is infeasible (although much of that history comes from
disconnected networks).


