Proposed new Firefox IDN display algorithm

Tue Feb 7 07:35:00 CET 2012

--On Tuesday, February 07, 2012 02:02 +0100 "J-F C. Morfin"
<jfc at morfin.org> wrote:

> At 21:24 06/02/2012, John C Klensin wrote:
>> Gerv,
> 
> John,
> 
> as long as we are talking of a Mozilla-IPI (International
> Plug-In) that can be used with a transparent Firefox and every
> other browser when in transparent mode, so we get the same
> result whatever the browser or we can use whatever other IPI
> approved by the local ISOC Chapter, ICANN, the national
> university association, ZDNet, the national ccTLD, etc. with
> Firefox there is no problem with IUsers.

I actually wasn't talking about that.  My reason is one of those
in which we may converge in the extreme case even if we disagree
on most of the details (I'm not sure we do).   

I'm quite certain that there is no perfect solution to the
problems and alternatives that drive Gerv's policy (and other
policies in other browsers).   I think that the classic remedy
of "a good compromise is one that makes everyone equally
unhappy" is not a good solution in this type of human interface
situation.   And, while I really like the idea of Gerv and his
colleagues that every version of Firefox, on every platform and
in every language adaptation, should behave the same way wrt
IDNs, I'm not sure that is the most important objective.  In
particular, I think it is possible that the ability to localize
and to adjust to different usage patterns may be more important.

I'm also really, really afraid of the possible consequences of
widespread appearance of "????" or other "tried to
display that and couldn't" situations.  I think that many of the
people who are concerned about confusion among characters are
paying too little attention to that one.

As a result, my preference is that:

(1) Different browsers try the ideas that they think will work
best so that we can all compare, ideas that are clearly good can
gradually spread, and, if it turns out that there are only
tradeoffs, users can make choices based on what suits their
needs and matches their taste.  Coming up with a universal
solution (or even a clear definition of a "transparent mode") at
this point seems to me to require knowledge that none of us
really have, independent of whether our guesses and hypotheses
agree or disagree.

(2) I deliberately didn't mention it in my long note but, from a
UI design standpoint, I'd like to see Firefox do two additional
things.  

One is to provide a switch that permits a user to say "I think
I'm smarter than you are and am willing to take responsibility
for that belief and its consequences".  If set in this case (it
should obviously be off by default), it should simply disable
all of the "display algorithm" stuff, causing the browser to
display whatever it can in native character form.  From my point
of view, similar switches in other browsers to disable _their_
algorithms and approaches to the problem would be a good idea
too.  For reasons that I think I understand, I don't expect
Mozilla to provide that switch (or perhaps to provide it and
make it hard for any but the most sophisticated users to find),
but I still think it would be a good idea.

The second, and even more important, is that I believe the
browser should provide a very accessible, very easy-to-use,
transcoder for these labels.  The best UI may differ among
browsers and platforms, but, as an example for desktop machines,
I'd like to be able to right-click on a domain name or label or
even highlighted/selected string and have all three of U-label
form, A-label form, and a list of Unicode code points (in U+NNNN
or \u'NNNN' form) easily available.  For the user who does know
what is going on (or who can learn) that particular
tool/facility is likely to be far more useful in the long run
than any collection of "we know more about this than you do and
are here to protect you" tools.  For what I think Gerv believes
is the more typical user (and he is probably right) such a tool
is, at worst, another feature like "display source" that will
never be used.
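
A minimal sketch of the three views described above (U-label,
A-label, and code point list), written in Python. It is only an
illustration: the function names are hypothetical, it uses the
standard library's raw "punycode" codec, and it performs none of
the IDNA validation or mapping a real browser would need.

```python
ACE_PREFIX = "xn--"  # prefix that marks an A-label


def to_a_label(label):
    """Return the ASCII-compatible form of one label (no IDNA checks)."""
    if label.isascii():
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def to_u_label(label):
    """Return the native-character form of one label (no IDNA checks)."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


def code_points(text):
    """List each character of text in U+NNNN form."""
    return ["U+%04X" % ord(ch) for ch in text]


def describe(domain):
    """Return all three forms for every dot-separated label."""
    rows = []
    for label in domain.split("."):
        u_label = to_u_label(label)
        rows.append({
            "u_label": u_label,
            "a_label": to_a_label(u_label),
            "code_points": code_points(u_label),
        })
    return rows


if __name__ == "__main__":
    # "b\u00fccher" is "bucher" with U+00FC in place of the "u".
    for row in describe("b\u00fccher.example"):
        print(row["u_label"], row["a_label"], " ".join(row["code_points"]))
```

The point of the sketch is that all three forms come from one
input string in one place, so nothing has to be copied to another
tool.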

If I have to copy a string and carry it to another tool, the
value of the approach goes down significantly, not just because
the inconvenience might discourage me from checking strings I
ought to check, but because the uncertainties of copy-and-paste
operations might yield false results.

As a trivial, ASCII-only, example, if I see "rn" on a small
screen and in poor light, it would be a huge advantage to be
able to get the browser to show me code points that would tell
me if I'm looking at U+006D or at U+0072 U+006E.
Similar examples for far more complex IDN cases should be
obvious.  For that set of examples, the code point list is
likely to be a lot more useful than an A-label.  There will
likely be other cases where the A-label (or the U-label if the
A-label is displayed) will be more useful.  FWIW, such a
facility will, IMO, become even more important if we start
seeing wide deployment of non-trivial IRIs (and, for them,
%-encoded UTF-8 needs to be on the list of forms that can be
seen or obtained too).
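
The "rn" versus "m" check, and the %-encoded UTF-8 form mentioned
for IRIs, can be sketched in a few lines of Python. The helper
names are hypothetical and the sketch assumes nothing beyond the
standard library.

```python
from urllib.parse import quote


def code_points(text):
    """Render text as a space-separated list of U+NNNN code points."""
    return " ".join("U+%04X" % ord(ch) for ch in text)


def percent_encoded_utf8(text):
    """Percent-encode the UTF-8 bytes of text, leaving only unreserved
    ASCII characters as they are."""
    return quote(text, safe="")


def same_characters(a, b):
    """True only if both strings consist of identical code points."""
    return code_points(a) == code_points(b)


if __name__ == "__main__":
    for sample in ("m", "rn", "b\u00fccher"):
        print(repr(sample), code_points(sample), percent_encoded_utf8(sample))
    print(same_characters("rn", "m"))
```

Two strings that look alike on screen produce different code
point lists here, which is the information the reader needs in
order to tell them apart.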

> This is why I suggest that your mail is published as an
> extension of RFC 5895.

I actually don't think it has much to do with 5895.  Independent
of that, if others think it is useful enough, I could certainly
put it together either as an I-D that might lead to an RFC or
as an article for publication elsewhere.   At least at this hour
of the night, I'd guess the latter might be more appropriate, if
only because the IETF and the RFC Series rarely go that far down
the path toward UI design.

best,
    john