Protocol Action: 'Right-to-left scripts for IDNA' to Proposed Standard

John C Klensin klensin at jck.com
Mon Feb 15 16:34:37 CET 2010



--On Monday, February 15, 2010 07:22 +0000 Shawn Steele
<Shawn.Steele at microsoft.com> wrote:

> We already have a "lot" of variation in different environments
> :(  And they don't always match the user expectations either,
> unfortunately, so this is a nasty problem.
> 
> IMO the best we can do would be to clarify what is expected
> when displaying an IDN/IRI, then at least the "desired"
> behavior is clear.  Unfortunately there may still be
> variations though, but hopefully at least people would be able
> to get a consistent behavior "in the address bar."

Shawn,

I think Mark's explanation is excellent and very helpful.  I
agree with you about the usability issues, although it is not
clear to me that the IETF is the right place to do that work or
even to evaluate it.  I had some significant experience in those
areas in the 70s, but that was a long time ago and, in the IETF,
even that level of experience puts me in a small minority.  Even
that small experience taught me that usability is
culturally-dependent (localization-dependent, if you prefer).
That, in turn, suggests that a globally-optimal solution may
just not be possible.  Label-ordering might be an aspect of
that: it is not at all clear to me that the correct behavior as
perceived by someone who is used to looking at mostly-LTR
strings in an RTL environment will be the same as that perceived
by someone used to an RTL environment but not used to looking at
those strings, much less the same as that perceived by someone
used to an LTR environment only.  I think those are just special
cases of the general issues Mark identified.

Let me take that discussion a little further than you, Mark,
Vint, or Patrik have:   

First, we really cannot consider this question from the
standpoint of the web only, nor from the point of view of user
interfaces in web browsers (e.g., "address bars").  Doing so
is, IMO, part of what has caused this situation, since the
requirements of other protocols, and of domain names, URIs, and
IRIs in running text, are different.

More broadly, IETF and W3C made two fundamental decisions a very
long time ago.  One was to permit tremendous flexibility in
different types of URIs with the particular syntax form
depending on the protocol identifier.  The other was to not have
standard delimiters that identify the beginning and end of a
URI.  In combination, those decisions mean that identifying the
presence of a URI and where it ends is a tricky business that
depends on heuristics that will not always work.
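
As a rough illustration of why such heuristics fail, here is a
minimal Python sketch.  It is not taken from any specification:
the pattern, the function names, and the sample sentence are all
hypothetical, and it handles only "http"/"https" to keep things
short.  Two common guesses about where a URI ends in running
text each get one of the two cases wrong.

```python
import re

# Hypothetical heuristic: a URI starts at "http://" or "https://" and
# runs to the next whitespace. Nothing in the URI itself marks its end.
NAIVE_URI = re.compile(r"https?://\S+")


def find_uris(text):
    """Return every substring the naive heuristic takes to be a URI."""
    return NAIVE_URI.findall(text)


def find_uris_trimmed(text):
    """Same heuristic, but strip trailing punctuation assumed to be prose."""
    return [uri.rstrip(".,;:!?)") for uri in find_uris(text)]


sample = "See http://example.com/a_(b). Then try http://example.org/x, please."

# Guess 1 keeps the sentence punctuation as part of both URIs.
print(find_uris(sample))

# Guess 2 fixes the second URI but cuts the ")" that belongs to the first.
print(find_uris_trimmed(sample))
```

Neither guess recovers both URIs as written, and a longer list of
special cases only moves the failures elsewhere.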

It is certainly too late to reverse one of those decisions; it
is conceivable that one could reverse the other one (either
generally or for any URI/IRI that contains non-ASCII characters
(%-encoded or directly)), but the hour draws late and the
problem isn't in scope for this WG (and probably isn't in scope
for the IRIbis WG).  If we were to adopt and require some sort
of unambiguous (for all protocol identifiers and all contexts)
"beginning/end of IRI" and maybe "beginning/end of domain name
outside URI/IRI contexts" marker or other identification
mechanism, then we could adopt a convention that would be
absolutely consistent in both wire-order and presentation
globally.  It would also provide a solid foundation for
additional localization if that were needed and a way to
distinguish between localized presentation forms and global
forms.
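
To make the contrast concrete, here is a small Python sketch of
extraction when a begin/end marker is required.  The choice of
"<" and ">" as the marker pair is purely an assumption for
illustration, as are the function name and the sample text; no
such requirement exists today.

```python
import re

# Assumption: "<" and ">" stand in for a mandatory begin/end marker
# pair. Everything between one pair is taken as the identifier.
DELIMITED = re.compile(r"<([^<>\s]+)>")


def find_delimited(text):
    """Return the identifiers enclosed in the assumed marker pair."""
    return DELIMITED.findall(text)


marked = "See <http://example.com/a_(b)>. Then try <http://example.org/x>, please."

# No guessing about trailing punctuation: the markers settle the boundary.
print(find_delimited(marked))
```

With explicit markers the boundary no longer depends on the
protocol identifier or on the surrounding prose, which is what
would allow a single presentation convention to be defined.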

As an alternative, we may need to give up on trying to make IRIs
(and isolated domain names) user-facing, user-friendly
identifiers and move to some sort of "above DNS" object
identifier forms that are more easily adapted to different
cultural-linguistic assumptions.  We have discussed that many
times and haven't gotten much further with it than with
unambiguous "beginning/end" delimiters or the equivalent.

Without that, I definitely believe that we are going to have
trouble in some user presentation contexts.  I fear that
usability studies would help us understand the types and forms
of trouble, but not produce any solutions that could be used
globally.

    john



More information about the Idna-update mailing list