numeric (ascii) labels (was: Re: draft-liman-tld-names-00.txt and bidi)

Vint Cerf vint at google.com
Tue Mar 10 03:34:26 CET 2009


eric,

given the need to program something to answer the question "could this  
dotted thing possibly be interpreted as an IP address" and the  
alternative "this could not possibly be an IP address because it is  
not all-numeric" I would incline towards what I think is the easier to  
program and not very restrictive choice that simply says:

1. no TLD can be all-numeric ASCII
2. leading (and trailing ?) digits are disallowed in labels
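
As a rough illustration of why these two rules are cheap to program, here is a minimal Python sketch. The function names (`is_all_numeric`, `could_be_ip_address`, `tld_label_allowed`) and the `forbid_trailing_digit` switch are made up for this example and come from no draft or library; the switch simply mirrors the open question about trailing digits.

```python
import string

# ASCII digits only; str.isdigit() would also accept non-ASCII digits.
ASCII_DIGITS = set(string.digits)


def is_all_numeric(label: str) -> bool:
    """True if the label is non-empty and consists only of ASCII digits."""
    return len(label) > 0 and all(ch in ASCII_DIGITS for ch in label)


def could_be_ip_address(name: str) -> bool:
    """With rule 1 in force, only a name whose rightmost label is
    all-numeric could possibly be an address; everything else is a name."""
    if name.endswith("."):
        name = name[:-1]  # drop the root dot, if present
    return is_all_numeric(name.split(".")[-1])


def tld_label_allowed(label: str, forbid_trailing_digit: bool = True) -> bool:
    """Apply rules 1 and 2 to a candidate TLD label (hypothetical helper)."""
    if not label or is_all_numeric(label):
        return False  # rule 1: no all-numeric TLD
    if label[0] in ASCII_DIGITS:
        return False  # rule 2: no leading digit
    if forbid_trailing_digit and label[-1] in ASCII_DIGITS:
        return False  # rule 2, optional part: no trailing digit
    return True
```

The point of the sketch is only that the address-or-name decision reduces to looking at one label, without counting labels or parsing numbers.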

I gather you favor allowing all-numeric TLD labels. How does the rest  
of the WG see this?

v




Vint Cerf
Google
1818 Library Street, Suite 400
Reston, VA 20190
202-370-5637
vint at google.com




On Mar 9, 2009, at 10:19 PM, Eric Brunner-Williams wrote:

> Vint,
>
> What is the case in which
> 141592653589793238462643383279502884197169399375105820974944592. is
> "bad"? Lyman's probably got cases out to 8 digits in the label, but
> how about the rest?
>
> The "no nums without alphas" nonsense comes from the current ICANN  
> attempt to specify what can go into the IANA root. I don't mind  
> random gorp passing as good policy, but things that are represented  
> as technical requirements MUST, in the usual sense, be correct or  
> subject to correction. Pi may break my calorie budget, but it isn't  
> going to break root.
>
> Finally, as chair, what you think is easier to implement is
> interesting, but is it sufficient, or necessary, to characterize
> a specification? Having stuffed bits of P3P evaluation mechanism (lots
> of ugly xml, and for cookies, lots of equally ugly key-value pairs)  
> into mozilla, I'm OK with applications wasting a lot of programmer  
> and run-time to deal with questions like "is this thing an address  
> or is it a domain name or is it something else?"
>
> And the inet_addr(3) thing has been around for a long, long time. I  
> recall writing it, and not for the first time, in XPG/1, in 1986.
>
> Eric
>
> Vint Cerf wrote:
>> Eric,
>>
>> I did not say to ban digits at all levels (and ENUM is an example  
>> of use of digits that does not cause confusion, for instance).
>>
>> The limitation in the TLD space does have the benefit that no  
>> domain name would have the property that it could be confused with  
>> an IP address. I think that is simpler to implement than trying to  
>> check how many labels there are in the domain name, and if four  
>> labels, can it be interpreted as an IP address. As Lyman Chapin  
>> points out, some decimal values are interpreted as 32 bit values  
>> and thus as IPv4 addresses by some systems.
>>
>> I am only speaking of the TLD label space here, and not lower-level
>> labels. I don't know whether that makes a difference to you?
>>
>> v
>>
>>
>> Vint Cerf
>> Google
>> 1818 Library Street, Suite 400
>> Reston, VA 20190
>> 202-370-5637
>> vint at google.com
>>
>>
>>
>>
>> On Mar 9, 2009, at 6:44 PM, Eric Brunner-Williams wrote:
>>
>>> Vint,
>>>
>>> Your position then is that, because _people_ may mistake sequences
>>> of digits for addresses, labels should be constrained to contain at
>>> least one non-digit character, with the same constraint expressed
>>> for octal and hex labels?
>>>
>>> Everyone has their own notion of what constitutes acceptable  
>>> dumbness, and anyone who thinks that
>>>
>>> 3.141592653589793238462643383279502884197169399375105820974944592.
>>>
>>> is an ip address (the name is taken from one of my favorite .com  
>>> examples) is not doing us any favors by insisting that we design  
>>> around his or her grasp of the details. Other than by going blind,  
>>> one space at a time (oh the joy of cards punched long forgotten,  
>>> and OS dumps before the invention of symbolic debuggers, also  
>>> mercifully long forgotten), what is the difference between the  
>>> above and the following:
>>>
>>> 3.141592653589793238462643383279S02884197169399375105820974944592.
>>>
>>> Did an infix alpha really buy us anything?
>>>
>>> Also, it simply isn't useful to state "DNS specs are not the sole  
>>> guide to conventions" without some specifics. What do we use?  
>>> Augury?
>>>
>>> I'm not keen on making the mistaken rule that "." in a string  
>>> handed to a resolver is punctuation and has a weak directionality  
>>> property, but if that has any use at all, that is, a limit on  
>>> leading and trailing digits, I'd prefer to see it at the registry,  
>>> as local policy, not the protocol, where independent of the  
>>> directionality of the label, or even the recourse to punycode, the  
>>> policy is global, and mostly incorrect.
>>>
>>> Eric
>>>
>>> Vint Cerf wrote:
>>>> Eric,
>>>>
>>>> On blackberry, so very briefly, DNS specs are not the sole guide  
>>>> to conventions. I think much pain would be avoided if we banned
>>>> all-numeric TLDs since this would assure no possible confusion of
>>>> a host name and an IP address. Banning initial and trailing
>>>> numerics might have bidi benefits but perhaps concerns there  
>>>> could be confined within the bidi rule set.
>>>>
>>>> V
>>>>
>>>> ----- Original Message -----
>>>> From: Eric Brunner-Williams <ebw at abenaki.wabanaki.net>
>>>> To: John C Klensin <klensin at jck.com>
>>>> Cc: Lyman Chapin <lyman at acm.org>; Martin Duerst <duerst at it.aoyama.ac.jp>;
>>>> Andrew Sullivan <ajs at shinkuro.com>; Vint Cerf;
>>>> idna-update at alvestrand.no <idna-update at alvestrand.no>
>>>> Sent: Mon Mar 09 10:26:54 2009
>>>> Subject: numeric (ascii) labels (was: Re:
>>>> draft-liman-tld-names-00.txt and bidi)
>>>>
>>>> Howdy,
>>>>
>>>> When the preliminary language to what is now ICANN's Guidebook  
>>>> for Applicants (GfA, but it has several alternate TLAs, just to  
>>>> be amusing), contained the "no numeric label" language, in  
>>>> decimal, octal and hex forms, I spent some time, initially with  
>>>> Kurt Pritz, and later with Olaf Nordling, to explain the
>>>> inet_addr(3) issue.
>>>>
>>>> The language didn't change in GfAv2, issued two weeks ago, though  
>>>> someone did explain, as Lyman did below, that there is software  
>>>> which does the wrong thing. The GfAv2 text, like Lyman's, doesn't  
>>>> fully treat the cases to find the set of constraints which will  
>>>> allow a sequence of labels, some of which are numeric, to be  
>>>> strictly interpreted as a name, rather than as an address.
>>>>
>>>> In the history of ICANN's "new gTLD" effort(s), software which  
>>>> does the wrong thing has been ignored, e.g., the "terminal labels  
>>>> have length 4 or less" error (.arpa and the three and two ascii  
>>>> sequence labels, resulting in the temporary clobbering of .museum  
>>>> and other new gTLDs), and software which does the wrong thing has  
>>>> been controlling, e.g., the "email addresses are formed of 7-bit  
>>>> octet sequences" error (a rationale for the "A" in "IDNA"), whose
>>>> consequences are still before us today.
>>>>
>>>> My personal view is that broken code that isn't a de facto
>>>> specification of the DNS, or broken specifications of things
>>>> other than the DNS, need to go find their authors and get fixed,
>>>> and not become de jure nuances of the "corrected" specifications
>>>> of the DNS. In particular, it is reasonable for any zone admin,  
>>>> the IANA included, to make a registry-local rule reflecting  
>>>> momentary annoyance at the existence of well-known bugs, but that  
>>>> no such "rule" should be internalized to the DNS specs, with a  
>>>> vastly longer shelf-life than the random DNS (mis)using  
>>>> application.
>>>>
>>>> Yes, there is a bug (actually, a shared bug with multiple,  
>>>> possibly independent interoperable implementations of obvious  
>>>> brokenness), but 666 is no different from AAA, and a five label  
>>>> sequence composed of numeric (or octal or hex) character values  
>>>> is safe as houses (if ugly), and it is possible to constrain  
>>>> allocation of label sequences so that label sequences terminating  
>>>> in numeric (or octal or hex) character values, and having fewer  
>>>> than five labels, are also not incorrectly interpreted by this  
>>>> bug-set as dotted quads.
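
The label-count argument above can be made concrete with a simplified Python model of the classic BSD inet_addr(3) conventions: one to four dot-separated parts, each decimal, octal (leading 0) or hex (leading 0x), with the last part filling whatever bytes remain. This is a sketch under those assumptions, not the code of any particular libc; the names `parse_part` and `legacy_addr` are invented for the example, and stripping a trailing root dot is a choice made here for illustration. Note that this model rejects out-of-range parts, whereas the buggy software under discussion may truncate them instead.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")
OCT_DIGITS = set("01234567")
DEC_DIGITS = set("0123456789")


def parse_part(text):
    """Parse one part as C-style hex (0x...), octal (0...) or decimal.
    Returns an int, or None if the text is not a number in that syntax."""
    if text[:2] in ("0x", "0X"):
        digits, allowed, base = text[2:], HEX_DIGITS, 16
    elif len(text) > 1 and text[0] == "0":
        digits, allowed, base = text[1:], OCT_DIGITS, 8
    else:
        digits, allowed, base = text, DEC_DIGITS, 10
    if not digits or any(ch not in allowed for ch in digits):
        return None
    return int(digits, base)


def legacy_addr(name):
    """Return the 32-bit value a classic parser would assign to `name`,
    or None if the string can only be a domain name."""
    if name.endswith("."):
        name = name[:-1]  # assumption: ignore the root dot
    parts = name.split(".")
    if not 1 <= len(parts) <= 4:
        return None  # five or more labels can never be an address
    values = [parse_part(p) for p in parts]
    if any(v is None for v in values):
        return None  # at least one label is not numeric
    *head, last = values
    if any(v > 0xFF for v in head):
        return None  # leading parts must each fit in one byte
    last_bits = 8 * (4 - len(head))  # last part fills the remaining bytes
    if last >= 1 << last_bits:
        return None
    addr = last
    for i, v in enumerate(head):
        addr |= v << (24 - 8 * i)
    return addr
```

Under this model, "127.1" and a bare "666" both come back as addresses, while any five-label sequence, and the two-label pi example quoted elsewhere in this thread, come back as names.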
>>>>
>>>> Of course, ICANN is only a part of the design constraint, and one  
>>>> could say "0 is not allowed as a label in .", but the rationale
>>>> would be for reasons other than those in the DNS specs -- and in
>>>> a separate note I'll address Liman's draft, which covers some of
>>>> the issues addressed in 2929.
>>>>
>>>> Eric
>>>>
>>>>
>>>>
>>>> John C Klensin wrote:
>>>>
>>>>> --On Saturday, March 07, 2009 11:01 -0500 Lyman Chapin
>>>>> <lyman at acm.org> wrote:
>>>>>
>>>>>
>>>>>> Martin and Andrew,
>>>>>>
>>>>>> Although it seems that numeric values above 255 would be safe,
>>>>>> some software looks only at the low-order 8 bits of a number
>>>>>> encoded in a 16-bit (for example) field (ignoring any
>>>>>> high-order bits) when it "knows" that a numeric value will
>>>>>> always be 255 or less. In that case only the 8 low-order
>>>>>> bits (10011010) of 666 (...01010011010) would be recognized.
>>>>>> Entering "666" into such an interface would be equivalent to
>>>>>> entering "154".
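
The arithmetic in this example can be checked with a few lines of Python. The helper name `low_octet` is made up for the illustration; it models nothing more than a program keeping the low-order 8 bits of a wider value.

```python
def low_octet(value: int) -> int:
    """Keep only the low-order 8 bits, discarding any high-order bits."""
    return value & 0xFF


print(format(666, "b"))               # 1010011010
print(format(low_octet(666), "08b"))  # 10011010
print(low_octet(666))                 # 154
```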
>>>>>>
>>>>> Lyman,
>>>>>
>>>>> I'm completely confused and don't know what you are talking
>>>>> about.  If the issue is domain names, expressed in the preferred
>>>>> syntax of dot-separated ASCII characters, "666" is as good as
>>>>> "ABC" or "ACM".  If the issue is numeric values, the DNS spec
>>>>> understands only octets and not, e.g., 16 (UTF-16?) or 32
>>>>> (UTF-32/UCS-4) data fields.  The last I looked, it was quite
>>>>> hard to fit a decimal number larger than 255 into an octet.
>>>>>
>>>>> So, what are you saying?
>>>>>
>>>>>   john
>>>>>
>>>>> _______________________________________________
>>>>> Idna-update mailing list
>>>>> Idna-update at alvestrand.no
>>>>> http://www.alvestrand.no/mailman/listinfo/idna-update
>>>>>
>>>>>
>>>>>
>>
>>
>>


