LDH-label terminology (was: Re: Comments on idnabis-rationale-01)

John C Klensin klensin at jck.com
Sat Jul 26 00:15:44 CEST 2008


--On Thursday, 24 July, 2008 20:37 -0400 Eric Brunner-Williams
<ebw at abenaki.wabanaki.net> wrote:

> Data label taxonomy:
> 
> binary (rfc2673) [any bit-boundary]
>     text (none) [zero or more octets]
>        ascii (none) [zero or more octets, encoding]
>           ldh (rfc1035) [obvious subset]
>             a-label (idnabis) [xn-- prefix, plus stuff]
>             z-labels (none) [anything with an "--", not
> preceded by "xn", not ^anchored, to suggest an answer to
> Frank's question]

Ok.  But, to do a bit of nit-picking first (possibly important
because we are trying to be clear about terminology here):

(1) 1035 doesn't ever use the term "LDH".  It does use a
production <ldh-str> that is _part_ of the definition of <label>
in its "fewer problems" syntax.   The "obvious" definition of
"ldh" would permit labels like "--foo--", which is obviously not
what is intended in 1035 (or any of the many other places that
define what you are trying to get at).

(2) As I read 2673, the binary labels it defines are usable only
with EDNS0 or better and require an extended type code because
they are concatenated one-bit labels.   That makes them a
disjoint category from the octet-aligned labels that you refer
to as "text".   What is normally thought of as "text" (not
necessarily ASCII and not well defined) is a subset of a
traditional label.  Quoting from Section 11 of RFC 2181, "Those
restrictions aside, any binary string whatever can be used as
the label of any resource record".  "Those restrictions" are the
length rules and, implicitly, octets.
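Returning to point (1) for a moment, the gap between the "obvious"
reading of LDH and the <label> production in RFC 1035 can be shown
with a short Python sketch.  The function names and regular
expressions here are illustrative assumptions, not text from any
specification; the 63-octet length limit is folded into both
patterns.

```python
import re

# "Obvious" reading: 1 to 63 letters, digits, or hyphens, in any position.
NAIVE_LDH = re.compile(r"[A-Za-z0-9-]{1,63}")

# RFC 1035 <label>: a letter first, a letter or digit last, and
# letters, digits, or hyphens in between (63 octets at most).
RFC1035_LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_naive_ldh(label: str) -> bool:
    """True if the label uses only letters, digits, and hyphens."""
    return NAIVE_LDH.fullmatch(label) is not None


def is_rfc1035_label(label: str) -> bool:
    """True if the label matches the RFC 1035 <label> production."""
    return RFC1035_LABEL.fullmatch(label) is not None


for candidate in ("foo", "--foo--", "zz--12345", "_TCP"):
    print(candidate, is_naive_ldh(candidate), is_rfc1035_label(candidate))
```

Under these assumptions "--foo--" passes the naive test but fails
the grammar, which is the distinction being drawn in (1).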

So that nit would make the upper levels of your taxonomy

 Data label taxonomy:

 bit-string (rfc2673) [any bit-boundary]
 arbitrary-label (rfc1034, 1035, 2181) [1 to 63 octets, 
           otherwise arbitrary binary string]
     text (none) [probably unnecessary, but maybe, 
           e.g., no all-zero octets]
        ascii (none) [zero or more octets, encoding]
             [...]
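
As a rough illustration of how the octet-aligned upper levels nest,
the following Python sketch reports which of them a given label
falls into.  It is only a sketch: the function name is hypothetical,
"text" is read here as "contains no zero octet" (one possible
reading of the bracketed note above, not an agreed definition), and
bit-string labels are left out because they are a disjoint category.

```python
def upper_levels(label: bytes) -> list:
    """Return the nested upper-level categories that contain this label."""
    levels = []
    # RFC 2181, Section 11: only the length rules restrict a label.
    if 1 <= len(label) <= 63:
        levels.append("arbitrary-label")
        # Assumed reading of "text": no all-zero octets.
        if 0 not in label:
            levels.append("text")
            # ASCII: every octet has the high bit clear.
            if all(octet < 0x80 for octet in label):
                levels.append("ascii")
    return levels


print(upper_levels(b"_TCP"))
print(upper_levels("bücher".encode("utf-8")))
print(upper_levels(b"\x00\xff"))
```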

Even if you do things the way you did in your version of the
taxonomy, or change "ldh" into some abbreviation for
"LDH-rule-conforming-string" ("ldh*", below), we would need,
IMO, and using your definitions above

      ldh* (rfc1035) [obvious subset]
        a-label (idnabis) [xn-- prefix, plus stuff]
        xn-junk-label (xn-prefix, but not a valid A-label)
        z-labels (none, but with "--" in 3,4 and not "xn--")
        ldh-not-IDN1 (everything else)

or

      ldh* (rfc1035) [obvious subset]
        a-label (idnabis) [xn-- prefix, plus stuff]
        xn-junk-label (xn-prefix, but not a valid A-label)
        ldh-not-IDN2 (everything else)
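
The two partitions above can be sketched in Python as follows.  This
is a minimal sketch under stated assumptions: "ldh*" is approximated
by the RFC 1035 <label> grammar, the function and parameter names are
hypothetical, and A-label validity is delegated to a caller-supplied
predicate because the real rules live in the IDNA documents and are
not reproduced here.

```python
import re
from typing import Callable

# Approximation of ldh*: the RFC 1035 <label> production.
LDH_STAR = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def classify(label: str,
             is_valid_a_label: Callable[[str], bool],
             with_z_labels: bool = True) -> str:
    """Place a label in the first (with z-labels) or second partition."""
    if LDH_STAR.fullmatch(label) is None:
        return "not-ldh*"
    if label[:4].lower() == "xn--":
        # Prefix present: either a real A-label or junk.
        return "a-label" if is_valid_a_label(label) else "xn-junk-label"
    if with_z_labels and label[2:4] == "--":
        # "--" in positions 3 and 4, but not the "xn--" prefix.
        return "z-label"
    return "ldh-not-IDN1" if with_z_labels else "ldh-not-IDN2"


# Toy stand-in for real A-label validation (an assumption for the demo):
# "xn--bcher-kva" is the encoded form of "bücher".
KNOWN_A_LABELS = {"xn--bcher-kva"}


def toy_validator(label: str) -> bool:
    return label.lower() in KNOWN_A_LABELS


for candidate in ("xn--bcher-kva", "xn--zzzz", "zz--12345", "example"):
    print(candidate,
          classify(candidate, toy_validator),
          classify(candidate, toy_validator, with_z_labels=False))
```

In the second partition "zz--12345" simply lands in ldh-not-IDN2,
which is why that version needs no separate z-label term.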

Now, I can write the spec without having to get involved with
"z-labels" (I think Rationale now does that, spelling
"ldh-not-IDN2" as "LDH-label").  I don't know how to write it
without xn-junk-label and either ldh-not-IDN1 or ldh-not-IDN2.  

So it still leaves us needing a term for ldh-not-IDN1 or
ldh-not-IDN2.  

I may just be blocked on this, so, if people see a way to
write the text and avoid the need for either of the terms for
which those are placeholders, I'd appreciate specific textual
suggestions, not just complaints about the terms (see below).


--On Thursday, 24 July, 2008 18:43 -0700 Paul Hoffman
<phoffman at imc.org> wrote:

> At 6:52 PM -0400 7/24/08, Vint Cerf wrote:
>> if A-Label is confined to the "xn-- <punycode lower case
>> stuff>" then it is a subset of LDH
> 
> Whenever I think of the terms, I agree with Vint's and Frank's 
> definition, not the one used in the current documents. I'm
> surprised  that Tina and Cary think the terms are being
> understood by others.

Hmm.  Vint will have to clarify what he actually intended, but
certainly A-labels are LDH-conformant.  I didn't take his remark
as saying anything else.  As far as I know, no one has ever
suggested otherwise.  That is, after all, the whole point of
IDNA.   The question is whether "LDH-label" is an appropriate
term for what I've identified as ldh-not-IDN2 above.

Having spent more time than I would like in ICANN and related
contexts in recent years, I share Tina's and Cary's belief.   To be quite
precise, the general understanding of the DNS and DNS
terminology in that community is fairly poor.  I believe that
the vast majority of ICANN participants would be amazed to
discover that "_TCP" is a valid label and that the notions of
either arbitrary octet-aligned binary labels or bit-string
labels would blow their minds.   But, within the limits of their
understanding, it has become generally understood and
comfortable to believe that an LDH-label is an ASCII label that
conforms to the LDH rule and has nothing to do with IDNs, while
A-labels are IDNs as stored in the DNS.  

There is some deliberate imprecision in that definition that is
also consistent with what I believe to be the general level of
understanding around ICANN.   Because so few people in that
community have heard about SRV, binary labels, bit-string
labels, etc., or expressed any desire to ever register or use
zz--12345, I believe that questions about how they would see or
describe the rest of the taxonomy that Eric outlines (with or
without corrections) would draw a lot of blank looks (plus a few
wrong answers delivered with great conviction).

The biggest problem I see with the "LDH-label" terminology is,
for that reason, not the term itself.  It is that, unless we
start making assertions about things that I think we've agreed
are out of WG scope, it is hard to define precisely enough to
stand up to a comprehensive DNS taxonomy (Eric's or someone
else's), even if it turns out to be good enough for the typical
casual user of, or zone administrator for, the DNS.   That is
probably justification for trying to rewrite the relevant text
to avoid the need for any term of that sort, but see above.

    john


