Single-letter names (was: Re: Update of RFC 2606 based on the recent ICANN changes?)

John C Klensin john-ietf at jck.com
Fri Jul 4 19:50:01 CEST 2008


Vint,

In the ASCII space, there have been three explanations offered
historically for the one-character prohibition on top and
second-level domains.   I've written variations on this note
several times, so will just try to summarize here.  Of the
three, the first is at best of only historical interest and may
be apocryphal, and the second is almost certainly no longer
relevant.  The third remains significant.

(1) Jon has been quoted as suggesting that we could have
eliminated many of the problems we now face with TLDs and
simultaneously made the "no real semantics in TLD names" rule
much more clear had we initially allocated "b".."y" as TLDs.
Then, when someone asked for an assignment, it would have been
allocated at random to one of those domains.  While this has a
certain amount of appeal, at least in retrospect, there is
probably no way to get from where we are today to that model...
unless actions taken in the near future so ruin the current DNS
tree as a locus for stable and predictable references that we
need to start over with a new tree.  I don't think that a "have
to start over" scenario is at all likely, but I no longer believe
it to be impossible.

(2) There was an idea floating around for a while that, if some
of the popular TLDs "filled up", one could create single-letter
subdomains and push subsequent registrations down the tree a
bit.  For example, if .COM were declared "full", then "a.com",
"b.com", etc., would be allocated and additional reservations
pushed into subdomains of those intermediate domains rather than
being registered at the second level.  Until and unless the
conventional wisdom that adding more names to .COM merely
requires more hardware and/or bandwidth is overturned, there
won't be a "filled up" point at which this sort of strategy
could be triggered.  Worse, trying to use single-letter
subdomains as an
expansion mechanism would raise political issues about putting
latecomers at a disadvantage that would be, IMO, sufficient to
completely kill the idea.  In the current climate, I think the
community would decide that it preferred a dysfunctional DNS if
that were ever the choice (see the "start over" remark above).

(3) At least in the discussions that led up to RFC 1591, and
probably much earlier, there were concerns about reducing the
likelihood of false hits if the end user made single-character
typing errors.  With only 26 (or maybe 36) possible characters,
it could just about be guaranteed that all of them would be
registered and that _any_ typing error would yield a false
match.  That, in itself, has been considered sufficient to
prohibit single-letter labels and, by extension, to be fairly
careful about two-letter ones.   There have been arguments on
and off over the years as to whether this is a "technical"
reason or an attempt to set policy.  Even though the mismatches
would obviously not cause the network to explode or IP to stop
working, at least some of us consider the information
retrieval and information-theoretic reasons to insist on more
information in domain name labels in order to lower the risk of
false positive matches to be fully as "technical" as something
that would have obvious lower-level network consequences.
Others --frankly especially those who see commercial advantage
in getting single-letter names-- have argued that this position
is just a policy decision in disguise.
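
[Editorial illustration, not part of the original note.]  The
false-match argument in (3) can be made concrete with a rough
sketch in Python.  It assumes, purely for illustration, that
registered labels are spread uniformly over all possible labels
of a given length and that a typo turns the intended label into
some other label of the same length.  The function name and its
parameters are hypothetical.

```python
def false_match_probability(alphabet_size: int, label_length: int,
                            registered: int) -> float:
    """Chance that a mistyped label collides with another registered label.

    Assumptions (not from the original note): registered labels are spread
    uniformly over all possible labels of this length, and a typo replaces
    the intended label with a different label of the same length.
    """
    total = alphabet_size ** label_length
    if not 1 <= registered <= total:
        raise ValueError("registered must be between 1 and the label count")
    if total == 1:
        return 0.0  # no other label exists to be mistyped into
    # Of the (total - 1) labels other than the intended one,
    # (registered - 1) are registered to someone else.
    return (registered - 1) / (total - 1)


if __name__ == "__main__":
    print(false_match_probability(26, 1, 26))    # all single letters taken
    print(false_match_probability(36, 1, 36))    # letters and digits, all taken
    print(false_match_probability(26, 3, 1000))  # sparse three-letter space
```

Under these assumptions a fully registered single-character
space gives a probability of 1 for either alphabet, which is the
"any typing error yields a false match" case, while longer
labels in a sparsely registered space give a much smaller value.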

Note that, with slight modifications, the second and third
arguments apply equally well to TLD allocations and to SLD
allocations, especially in popular domains.  

The reasoning associated with the third case also applies to any
other script that contains a fairly small number of characters.
One could manage a long philosophical discussion as to whether
there are sufficient characters in the fully-decorated
Latin-derived collection to eliminate the problem, but an
analysis of keyboard and typing techniques/input methods for
that range of characters would, IMO, yield the same answer --
single-letter domains are just not a good idea and two-letter
ones near the top of the tree should be used only with great
caution.   

On the other hand, the same reasoning would break down when
confronted with a script that contains thousands of characters,
such as the "ideographic" ones.  There are enough characters
available in those scripts that one can presumably not worry
about single-character typing errors (and one can perhaps worry
even less if the usual input methods involve typing
phonetically, using a different script, and then selecting the
relevant characters from a menu -- in those cases, the phonetic
representations are typically more than a character or two long
and the menu selection provides an extra check about false
matches).

     john



--On Thursday, 03 July, 2008 19:04 -0400 Vint Cerf
<vint at google.com> wrote:

> seems odd to me too, James.
> 
> vint
> 
> 
> On Jul 3, 2008, at 6:14 PM, James Seng wrote:
> 
>>> At the moment, the condition is "no single Unicode code
>>> point." To the extent that a single CJK ideograph can be
>>> expressed using a single Unicode code point, this would
>>> represent the situation to which you say you would object. I
>>> will dig through my notes to find out why the "single
>>> character" condition was adopted -
>> 
>> Would you be able to explain why the condition is "no single
>> Unicode code point"? Whats the technical basis for that?





