draft-liman-tld-names-00.txt and bidi

Vint Cerf vint at google.com
Sat Mar 7 12:02:20 CET 2009


Martin, et al,

I would have thought that all-digit labels would be hazardous,
since they could be confused with dotted IP address notation,
and would therefore be forbidden?
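
To make the concern concrete, here is a minimal Python sketch (not taken from the draft; the function names are illustrative assumptions). It flags all-digit labels and checks whether a whole name parses as a strict dotted-decimal IPv4 address, using the standard `ipaddress` module.

```python
import ipaddress
from typing import List


def is_all_digit_label(label: str) -> bool:
    """True if the label is non-empty and consists only of ASCII digits."""
    return label.isascii() and label.isdigit()


def is_dotted_quad(name: str) -> bool:
    """True if the whole name parses as a strict four-octet IPv4 address."""
    try:
        ipaddress.IPv4Address(name)
        return True
    except ValueError:
        return False


def all_digit_labels(name: str) -> List[str]:
    """Return the labels of a dotted name that are all digits."""
    return [lab for lab in name.rstrip(".").split(".") if is_all_digit_label(lab)]


print(is_dotted_quad("192.0.2.1"))          # True
print(is_dotted_quad("www.example.123"))    # False
print(all_digit_labels("www.example.123"))  # ['123']
```

This only models the strict four-octet form. Some address parsers have historically accepted shorter numeric forms as well, which the sketch does not attempt to cover.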

v


Vint Cerf
Google
1818 Library Street, Suite 400
Reston, VA 20190
202-370-5637
vint at google.com




On Mar 7, 2009, at 1:43 AM, Martin Duerst wrote:

> Hello Andrew,
>
> I'm replying to the list because I think there might be enough
> people interested, at least tangentially.
>
> At 07:11 09/03/07, Andrew Sullivan wrote:
>> Dear colleagues,
>>
>> This is slightly off-topic (although related), but I know some experts
>> who have thought about this issue are here, so I thought I'd better
>> ask.
>>
>> Over on the DNSOP list, we're discussing
>> draft-liman-tld-names-00.txt.
>> One of the interesting arguments that has cropped up has to do with
>> leading or ending digits on a label.
>>
>> Now, we have some recommendations (and restrictions) on labels in the
>> bidi document, but of course that is something that restricts IDNs,
>> and not A-labels.
>>
>> The question that I have is whether there is a similar bidi issue for
>> A-labels (or, more importantly, non-IDN LDH labels: think of a label
>> "123abc", for instance) in a bidi display or entry context.  I've been
>> assuming "no" because we already have these sorts of labels today and
>> I imagined whatever is happening now would apply.  But it strikes me
>> that we wouldn't be introducing bidi restrictions if there weren't
>> already a problem.  So is there an issue here that might be relevant
>> to the I-D in question?
>
> You are right that there is a bidi issue. For a very specific
> example, please see Example 11 at
> http://www.w3.org/International/iri-edit/BidiExamples
> (please read the legends or tooltips carefully).
>
> The reason why there are bidi issues is:
> - Non-IDN labels turn up in IDNs
> - Digits get close to RTL characters, maybe only separated by dots
> - In the bidi algorithm, numbers and dots get associated with nearby
>  text and thrown around
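
The character classes behind those three points can be inspected directly. The following is a small Python sketch using the standard `unicodedata` module; the helper name `bidi_classes` is an assumption made for illustration. It shows that letters carry a strong direction (L, R, or AL), while digits (EN) and the dot (CS) are weak types whose placement depends on their neighbours.

```python
import unicodedata


def bidi_classes(text):
    """Return the Unicode bidirectional class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


# ASCII letter, dot, ASCII digit
print(bidi_classes("a.1"))            # ['L', 'CS', 'EN']

# Hebrew alef (U+05D0) and Arabic alef (U+0627), written as escapes
print(bidi_classes("\u05d0\u0627"))   # ['R', 'AL']

# An RTL label followed by an all-digit label: only weak types after the dot
print(bidi_classes("\u05d0\u05d1.123"))
```

This only reports the classes. It does not run the reordering algorithm itself, which is what actually moves the digits and dots around on display.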
>
> Digits between letters of the same directionality get insulated
> from their surroundings, that's why IDNA2003 required RTL letters
> at either end of a label containing RTL characters.
> IDNA2003 did not require labels with LTR characters to have LTR
> characters at either end, simply because such labels were already
> out there, and because IDNA wasn't in charge of ASCII-only labels.
> However, that doesn't mean that we wouldn't have wanted to prohibit
> them if we had been able to.
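
The IDNA2003 rule described above comes from the bidi requirements of Stringprep (RFC 3454, Section 6). Below is a minimal Python sketch of that check, assuming the label is given in its Unicode form; the function name is illustrative. It relies on the standard `stringprep` module, whose tables D.1 and D.2 list the RTL and LTR characters respectively.

```python
import stringprep


def idna2003_bidi_ok(label: str) -> bool:
    """Sketch of the Stringprep bidi check applied to a single label.

    If the label contains any RTL character (table D.1), it must contain
    no LTR character (table D.2), and its first and last characters must
    both be RTL. Labels without RTL characters are not restricted.
    """
    rtl = [stringprep.in_table_d1(ch) for ch in label]
    if not any(rtl):
        return True   # e.g. "123abc": the rule simply does not apply
    if any(stringprep.in_table_d2(ch) for ch in label):
        return False  # mixes RTL and LTR letters
    return bool(rtl[0] and rtl[-1])


print(idna2003_bidi_ok("123abc"))          # True
print(idna2003_bidi_ok("\u05d01\u05d1"))   # True: digit enclosed by RTL letters
print(idna2003_bidi_ok("\u05d01"))         # False: trailing digit
```

Note how "123abc" passes: the check says nothing about LTR labels with digits at their edges, which is exactly the gap discussed in the paragraph above.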
>
> For IDNA2008, the situation is slightly different. As far as I
> understand, it doesn't prohibit specific labels, just combinations
> of labels that can cause visual havoc. That means that in some
> situations, a digit at the end of an RTL label may be allowed.
> Turned the other way, it would mean that non-IDN TLDs could be
> created quite freely, but some of them (e.g. those starting with
> a digit) may not allow a second-level RTL label. The details of
> what the restrictions are would have to be calculated using
> Harald's approach. The question of how this is enforced would
> have to be sorted out by people who engage in this kind of thing.
>
>
>> Note that the I-D in question as it stands will not allow all-numeric
>> labels.  But there is a thread of argument that all-numeric labels
>> such as "666" ought to be allowed, on the grounds that such a label
>> could never be part of an IPv4 address anyway.
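
The "666" argument rests on the octet range of a dotted-decimal IPv4 address. A short Python sketch of that test follows; the function name is an assumption for illustration and is not part of the draft.

```python
def fits_ipv4_octet(label: str) -> bool:
    """True if the label is all ASCII digits and its value is in 0..255."""
    return label.isascii() and label.isdigit() and int(label) <= 255


print(fits_ipv4_octet("666"))  # False
print(fits_ipv4_octet("123"))  # True
```

Under this test the argument covers "666" but not every all-numeric label: "123" is all digits and also a valid octet value.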
>
> As far as my experience with Bidi goes, all-numeric labels
> won't be significantly worse than labels with digits at
> either or both ends. What happens in detail may be slightly
> different, but bad things will happen either way.
> There may even be cases where all-digit labels 'perform'
> better, because the digits will stay together and so there
> will be no "jumping the dot" phenomenon for parts of a label.
> But there may still be visual havoc for the overall order
> of labels and, very importantly, different domain names may
> still lead to the same visual representation (because of
> the bidi reordering). For that, please see Example 10
> in the above page; a logical
>   http://ab.123.CDEFGH/kl/mn/op.html
> will be displayed also as
>   http://ab.123.HGFEDC/kl/mn/op.html,
> same as the logical
>   http://ab.CDEFGH.123/kl/mn/op.html
> in the example.
>
>> If there are bidi
>> issues that are important, then the "no leading digit" rule in the I-D
>> is strengthened.
>
> It's not only leading digits. It's also trailing digits.
> Trailing digits don't affect standalone domain names
> (or so I think), but domain names often appear in context,
> the most frequent of which is an IRI/URI. The issues here
> are very much the same as for domain names alone; you
> can read about them in the IRI spec (RFC 3987), Section 4.
> The approach taken there is the same as for IDNA2003, but
> instead of 'label', the term 'component' is used in order
> to be more generic. Also, there are no MUSTs, only SHOULDs,
> because it's impossible for IRIs to dictate how their
> components are formed.
>
> We plan to adapt the bidi section of RFC 3987 once IDNA2008
> is more stable.
>
>
> As a summary, from a bidi viewpoint, digits at either end
> of a TLD label should be prohibited while that is still possible.
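
The summary recommendation can be expressed as a one-line policy check. This is a hedged Python sketch of one possible reading (a TLD label is rejected if it starts or ends with an ASCII digit); the function name is hypothetical and nothing here is taken from the I-D itself.

```python
import string


def tld_has_edge_digit(label: str) -> bool:
    """True if the label starts or ends with an ASCII digit."""
    return bool(label) and (
        label[0] in string.digits or label[-1] in string.digits
    )


for candidate in ["com", "a1b", "123abc", "abc123", "666"]:
    verdict = "reject" if tld_has_edge_digit(candidate) else "allow"
    print(candidate, verdict)
```

All-digit labels such as "666" are rejected as a side effect, since their first and last characters are digits.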
>
> Regards,    Martin.
>
>> Since this isn't strictly on-topic, please send replies off-list.
>> Thanks, and sorry for the diversion.
>>
>> A
>>
>> -- 
>> Andrew Sullivan
>> ajs at shinkuro.com
>> Shinkuro, Inc.
>> _______________________________________________
>> Idna-update mailing list
>> Idna-update at alvestrand.no
>> http://www.alvestrand.no/mailman/listinfo/idna-update
>
>
> #-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
> #-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst at it.aoyama.ac.jp


