RTL labels and numbers?
Erik van der Poel
erikv at google.com
Wed Oct 14 18:37:01 CEST 2009
Yup, that's why I said that there may be different "kinds" of
inter-label tests. Whether you call them multi-label tests,
inter-label tests, or whatever, at the end of the day, the client is
looking at a number of labels, and the rules depend on the contents of
those labels.
More importantly, I'm trying to point out that there are different
operations:
(1) registering a label
(2) looking up a domain name
(3) displaying a domain name
Traditionally, IETF documents have either refrained from specifying UI
behavior, or have given only rough guidelines (and may have done a
poor job, in some cases).
All I'm saying is that, as deployers and implementers, we have to
consider *all* of these operations. Of course, some clients are only
concerned with operations (2) and (3). But if these implementers do
not follow the same guidelines, we will have problems.
Another example of a "display" problem is the Firefox/Verisign
impasse. Firefox refuses to display IDNs under .com in Unicode form
because Verisign has not published satisfactory rules about
registering IDN labels that might be confusable with others (the
"spoofing", "phishing", "homograph" problem).
Of course, the IETF documents may not address the confusability
problem very extensively, but as a community, we have to address it
one way or another.
And so I'm saying that the current IDNAbis bidi draft rules can be
applied to the display problem.
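As a rough illustration of the draft definitions quoted below (an RTL
label contains at least one character of type R, AL or AN; a BIDI
domain name contains at least one RTL label), here is a minimal Python
sketch. The function names are hypothetical, the labels are assumed to
already be in Unicode form (not "xn--" form), and the six-condition
rule itself is not implemented; the sketch only shows which labels a
client would have to run that rule on.

```python
import unicodedata

# Bidi character types that make a label an "RTL label" per the draft text.
RTL_TYPES = {"R", "AL", "AN"}


def is_rtl_label(label):
    """True if the label has at least one character of type R, AL or AN."""
    return any(unicodedata.bidirectional(ch) in RTL_TYPES for ch in label)


def split_labels(name):
    """Split a domain name on dots, dropping empty labels (e.g. trailing dot)."""
    return [label for label in name.split(".") if label]


def is_bidi_domain_name(name):
    """True if the domain name contains at least one RTL label."""
    return any(is_rtl_label(label) for label in split_labels(name))


def labels_to_test(name):
    """Labels that must each pass the per-label bidi rule.

    Assumption: in a BIDI domain name every label is tested on its own;
    otherwise no label needs the bidi rule.
    """
    labels = split_labels(name)
    return labels if any(is_rtl_label(label) for label in labels) else []
```

A client doing lookup or display could call labels_to_test() on the
whole name and then apply the per-label rule to each label it returns.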
On Wed, Oct 14, 2009 at 9:18 AM, Andrew Sullivan <ajs at shinkuro.com> wrote:
> On Wed, Oct 14, 2009 at 09:00:12AM -0700, Erik van der Poel wrote:
>> There may be different kinds of "inter-label tests", but the current
>> draft does contain the following:
>> An RTL label is a label that contains at least one character of type
>> R, AL or AN.
>> A "BIDI domain name" is a domain name that contains at least one RTL
>> label. (Note: This definition includes domain names containing only
>> dots and right-to-left characters. Providing a separate category of
>> "RTL domain names" would not make this specification simpler, so has
>> not been done.)
>> The following rule, consisting of six conditions, applies to labels
>> in BIDI domain names.
>> In other words, the implementation is looking at an entire domain
>> name, potentially consisting of multiple labels.
> I was always a little unhappy about that text, but if you read it
> carefully you'll note that there is no need whatever to look at more
> than one label. Moreover, there's no _inter_-label test.
> During WGLC I pointed out that the definition was such that a domain
> name with only RTL labels qualified as a bidi name. The response was
> that this is ok.
> The upshot of this is that you can look at each label on its own. If
> you have a single label that contains any character of type R, AL, or
> AN, you're in bidi land and you run the tests on every label.
> More importantly, the rule applies to each label on its own. It is
> not an inter-label test in the sense that it tests anything about the
> labels in respect of one another. The "BIDI domain name" moniker
> merely gathers together all the relevant labels each of which has to
> pass a set of tests. But the labels are tested in isolation.
> Tests that consider the effects of other labels are what I would call
> inter-label tests. We might call the case you're pointing out
> "multi-label tests" in order to note that the same test has to run on
> every label. But any tests that are designed to undo the restrictions
> on numerals would necessarily need to take into account the effects of
> those numerals on other labels, and that's when I have a problem.
> Andrew Sullivan
> ajs at shinkuro.com
> Shinkuro, Inc.