Protocol Action: 'Right-to-left scripts for IDNA' to Proposed Standard

Shawn Steele Shawn.Steele at
Mon Feb 15 18:45:13 CET 2010

Another layer would be a HUGE issue.

I agree that this behavior may be locale dependent, however we're getting a tremendous amount of push-back from the Arabic speaking nations and governments to "fix" IE's rendering (currently basically UBA).  The "fix" requested is effectively c.b.a//:http.  (I'm not saying this is "right", just that we need to investigate what's going on.)
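
As a rough illustration of why plain UBA rendering can move the pieces of a URL around: the delimiters in a URL carry no strong direction of their own, so where they end up depends on the characters around them. The sketch below only prints the bidi class of each character using Python's standard `unicodedata` module. The sample URL and the helper name `bidi_classes` are made up for illustration, and nothing here reproduces IE's actual rendering.

```python
import unicodedata

def bidi_classes(text):
    """Return (character, bidi class) pairs for each character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

# Hypothetical sample: an ASCII scheme followed by three one-letter Arabic labels.
sample = "http://\u0627.\u0628.\u062C"

for ch, cls in bidi_classes(sample):
    # L = strong left-to-right, AL = strong Arabic letter,
    # CS = common separator (a weak type with no direction of its own)
    print(f"U+{ord(ch):04X} {cls}")
```

The scheme letters are strong LTR, the Arabic letters are strong RTL, and ":", "/" and "." are weak separators, which is why their visual placement is decided by context rather than by the URL's syntax.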

My understanding is that even if "consistency" were a trump card, systems today still aren't consistent.  Apparently in some places the UBA is overridden for display of the URL, and in others it's ignored, so "keeping" the UBA still isn't going to produce consistency.  (Although I'm getting tons of feedback, I don't consider myself a BIDI expert, so I don't have a lot of detail regarding where this works and where it does not.  I'm "only" trying to point out that this working group is at odds with the feedback we're getting from the Arabic community & governments.  I'm hearing all the complaints because I'm our "IDN expert." :) )

I'm not trying to advocate any particular behavior.  I AM trying to advocate that we don't bother with presentation until such time as we can get clear expectations from as many BIDI communities as possible.

I "get" that there are numerous reasons why we're in this state.  I'm merely pointing out that the users don't care how we got here, or that some layers may be hard to "fix"; they're just confused by the current behavior.

Although I'd concede that different cultural expectations may come into play, the feedback we're getting from bidi users is reasonably consistent.  (We are trying to ensure that we're getting data from all bidi speaking communities, but most of our feedback is from the Middle East.)

Maybe what we really need is a "display of BIDI IRIs" WG.


From: John C Klensin [klensin at]
Sent: Monday, February 15, 2010 7:34 AM
To: Shawn Steele; Vint Cerf; Mark Davis ☕; Amr Zaki
Cc: Slim Amamou; Michel Suignard; Abdulrahman I. ALGhadir; Aharon (Vladimir) Lanin; idna-update at
Subject: RE: Protocol Action: 'Right-to-left scripts for IDNA' to Proposed Standard

--On Monday, February 15, 2010 07:22 +0000 Shawn Steele
<Shawn.Steele at> wrote:

> We already have a "lot" of variation in different environments
> :(  And they don't always match the user expectations either,
> unfortunately, so this is a nasty problem.
> IMO the best we can do would be to clarify what is expected
> when displaying an IDN/IRI, then at least the "desired"
> behavior is clear.  Unfortunately there may still be
> variations though, but hopefully at least people would be able
> to get a consistent behavior "in the address bar."


I think Mark's explanation is excellent and very helpful.  I
agree with you about the usability issues, although it is not
clear to me that the IETF is the right place to do that work or
even to evaluate it.  I had some significant experience in those
areas in the 70s, but that was a long time ago and, in the IETF,
even that level of experience puts me in a small minority.  Even
that small experience taught me that usability is
culturally-dependent (localization-dependent, if you prefer).
That, in turn, suggests that a globally-optimal solution may
just not be possible.  Label-ordering might be an aspect of
that: it is not at all clear to me that the correct behavior as
perceived by someone who is used to looking at mostly-LTR
strings in an RTL environment will be the same as that perceived
by someone used to an RTL environment but not used to looking at
those strings, much less the same as that perceived by someone
used to an LTR environment only.  I think those are just special
cases of the general issues Mark identified.

Let me take that discussion a little further than you, Mark,
Vint, or Patrik have:

First, we really cannot consider this question from the
standpoint of the web only, nor from the point of view of user
interfaces in web browsers (e.g., "address bars").   Doing so
is, IMO, part of what has caused this situation, since the
requirements of other protocols and of domain names/ URIs/ IRIs
in running text are different.

More broadly, IETF and W3C made two fundamental decisions a very
long time ago.  One was to permit tremendous flexibility in
different types of URIs with the particular syntax form
depending on the protocol identifier.  The other was to not have
standard delimiters that identify the beginning and end of a
URI.   In combination, it means that identifying the presence of
a URI and where it ends is a tricky business that depends on
heuristics that will not always work.
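
To make the point about heuristics concrete, here is a minimal sketch, not anything proposed in this thread. It contrasts a common "scheme, then everything up to whitespace" heuristic with explicit delimiters. The two patterns, the function names, and the use of angle brackets as the delimiter are all assumptions chosen for illustration.

```python
import re

# Heuristic: a scheme, "://", then everything up to the next whitespace.
HEURISTIC = re.compile(r"[a-z][a-z0-9+.-]*://\S+", re.IGNORECASE)

# Explicit delimiters: anything between "<" and ">" (assumed convention).
DELIMITED = re.compile(r"<([^<>\s]+)>")

def find_heuristic(text):
    """Guess where URIs start and end in running text."""
    return HEURISTIC.findall(text)

def find_delimited(text):
    """Extract only strings that were explicitly marked as URIs."""
    return DELIMITED.findall(text)

running = "See http://example.com/a. Then reply."
marked = "See <http://example.com/a>. Then reply."

print(find_heuristic(running))  # the sentence-ending "." is captured too
print(find_delimited(marked))   # the end of the URI is unambiguous
```

The heuristic cannot tell whether the trailing "." belongs to the URI or to the sentence, while the delimited form leaves no doubt. The same ambiguity applies to where a bidi-aware display should treat the identifier as starting and stopping.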

It is certainly too late to reverse one of those decisions; it
is conceivable that one could reverse the other one (either
generally or for any URI/IRI that contains non-ASCII characters
(%-encoded or directly)), but the hour draws late and the
problem isn't in scope for this WG (and probably isn't in scope
for the IRIbis WG).  If we were to adopt and require some sort
of unambiguous (for all protocol identifiers and all contexts)
"beginning/end of IRI" and maybe "beginning/end of domain name
outside URI/IRI contexts" marker or other identification
mechanism, then we could adopt a convention that would be
absolutely consistent in both wire-order and presentation
globally.  It would also provide a solid foundation for
additional localization if that were needed and a way to
distinguish between localized presentation forms and global

As an alternative, we may need to give up on trying to make IRIs
(and isolated domain names) user-facing, user-friendly
identifiers and move to some sort of "above DNS" object
identifier forms that are more easily adapted to different
cultural-linguistic assumptions.  We have discussed that many
times and haven't gotten much further with it than with
unambiguous "beginning/end" delimiters or the equivalent.

Without that, I definitely believe that we are going to have
trouble in some user presentation contexts.  I fear that
usability studies would help us understand the types and forms
of trouble, but not produce any solutions that could be used

