RTL labels and numbers?
harald at alvestrand.no
Wed Oct 28 05:51:10 CET 2009
Andrew Sullivan wrote:
> On Wed, Oct 14, 2009 at 09:00:12AM -0700, Erik van der Poel wrote:
>> There may be different kinds of "inter-label tests", but the current
>> draft does contain the following:
>> An RTL label is a label that contains at least one character of type
>> R, AL or AN.
>> A "BIDI domain name" is a domain name that contains at least one RTL
>> label. (Note: This definition includes domain names containing only
>> dots and right-to-left characters. Providing a separate category of
>> "RTL domain names" would not make this specification simpler, so has
>> not been done.)
>> The following rule, consisting of six conditions, applies to labels
>> in BIDI domain names.
>> In other words, the implementation is looking at an entire domain
>> name, potentially consisting of multiple labels.
> I was always a little unhappy about that text, but if you read it
> carefully you'll note that there is no need whatever to look at more
> than one label. Moreover, there's no _inter_-label test.
> During WGLC I pointed out that the definition was such that a domain
> name containing only RTL labels qualified as a bidi name. The response
> was that this is ok.
> The upshot of this is that you can look at each label on its own. If
> you have a single label that contains any character of type R, AL, or
> AN, you're in bidi land and you run the tests on every label.
> More importantly, the rule applies to each label on its own. It is
> not an inter-label test in the sense that it tests anything about the
> labels in respect of one another. The "BIDI domain name" moniker
> merely gathers together all the relevant labels each of which has to
> pass a set of tests. But the labels are tested in isolation.
Yes, that was the intent.
Looking at it another way:
You can look at one label and know that any domain name using this
component label is a bidi domain name, and therefore running the tests
on all labels that you have access to is guaranteed to make sense.
If the label you're looking at does not contain RTL characters, you can
NOT assume that the whole domain name that is going to get looked up is
a non-BIDI domain name - for all the reasons given.
The tests still look at only one label at a time.
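As a rough sketch of that reading (this is not text from the draft; the function names and the label_test callback are made up for illustration, and the six conditions themselves are not implemented here), the logic could look like this in Python, using the standard unicodedata module to get each character's bidi type:

```python
import unicodedata

# Bidi types that make a label an "RTL label" per the quoted definition.
RTL_TYPES = {"R", "AL", "AN"}


def is_rtl_label(label):
    """True if the label contains at least one character of type R, AL or AN."""
    return any(unicodedata.bidirectional(ch) in RTL_TYPES for ch in label)


def is_bidi_domain_name(labels):
    """True if at least one of the labels is an RTL label."""
    return any(is_rtl_label(label) for label in labels)


def check_domain(name, label_test):
    """Run label_test on every label, but only for BIDI domain names.

    label_test is a hypothetical stand-in for the six-condition rule.
    It receives exactly one label, so it cannot compare labels with
    one another: the same test is simply repeated per label.
    """
    labels = [label for label in name.split(".") if label]
    if not is_bidi_domain_name(labels):
        return True  # the rule does not apply to this name
    return all(label_test(label) for label in labels)
```

Note that is_rtl_label() returning False for one label says nothing about the whole name; only is_bidi_domain_name() over all the labels you have access to can tell you that.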
> Tests that consider the effects of other labels are what I would call
> inter-label tests. We might call the case you're pointing out
> "multi-label tests" in order to note that the same test has to run on
> every label. But any tests that are designed to undo the restrictions
> on numerals would necessarily need to take into account the effects of
> those numerals on other labels, and that's when I have a problem.