A-label definition

Mark Andrews Mark_Andrews at isc.org
Mon Jun 23 04:51:50 CEST 2008


> John C Klensin wrote:
>  
> >> (1) LDH label, that's AFAIK 1 to 63 letters, digits,
> >>     and hyphens, not starting or ending with a hyphen.
>  
> > And not having two hyphens in the third and fourth positions,
> > according to the current definition in idnabis-rationale.
> 
> That would be a major change to what is currently known as
> label in a FQDN of a host.  I hope for one "updates: 1123",
> but not twenty of "updates: ????" for the various RFCs with
> their own idea of a host <label>:
> 
> |  Domain         = sub-domain *("." sub-domain)
> |  sub-domain     = Let-dig [Ldh-str]
> |  Let-dig        = ALPHA / DIGIT
> |  Ldh-str        = *( ALPHA / DIGIT / "-" ) Let-dig
> 
> That is an example in a not yet approved RFC about SMTP ;-)
> 
> > Note that IIRC 1035 doesn't say "LDH", and 1123 doesn't
> > either, they say "host name".
> 
> RFC 1035 defines <ldh-str> and <let-dig>, RFC 821 defines
> <ldh-str> and <let-dig>, RFC 937 uses <ldh>, 819, 882, 883,
> 1034, 2486, 2645, 2821, 3467, 3490, 3696, 3743, 4185, 4282,
> 4290, 4408, 4471, 4690, 4713, 5178.   That's what I found 
> with <http://purl.net/xyzzy/-a9/LDH+RFC> 

	Hostnames are defined in RFC 952 which is modified by 1123.
	RFC 1035 does NOT define hostnames.  It says to use
	*existing* rules.  For hostnames it says this would be
	those for hosts.txt, RFC 952.

	RFC 1123 also does not preclude alphanumeric TLDs.  What it
	does say is that all the currently allocated TLDs (at the
	time of writing) are alphabetic and that, because they are
	alphabetic, there is no possibility of a clash with a dotted
	decimal notation for an IPv4 address.

	At best there is guidance not to allocate a TLD which will
	potentially clash with a representation of an IPv4 address.

		0xde.0xad.0xbe.0xef
		222.173.190.239
		0xdeadbeef
		0336.0255.0276.0357
		033653337357
		3735928559

	xn--* will never clash with a dotted decimal or any other
	representation of an IPv4 address.
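
As a rough illustration of the point above, here is a minimal Python
sketch of the classic inet_aton-style numeric forms (one to four
parts, each decimal, 0-prefixed octal, or 0x-prefixed hex).  The
function names are made up for this example, and the parsing rules
are my reading of the traditional BSD behaviour, not a quote from any
RFC.  It checks that six spellings of 0xdeadbeef decode to the same
address, and that a name ending in an xn-- label cannot be read as a
number at all.

```python
import re

# One numeric component: hex (0x...), octal (leading 0), or decimal.
_PART = re.compile(r"0[xX][0-9A-Fa-f]+|0[0-7]*|[1-9][0-9]*")


def parse_part(part):
    """Return the integer value of one component, or None if not numeric."""
    if not _PART.fullmatch(part):
        return None
    if part[:2].lower() == "0x":
        return int(part[2:], 16)
    if part.startswith("0"):
        return int(part, 8)  # also covers a bare "0"
    return int(part, 10)


def parse_ipv4_numeric(text):
    """Return the 32-bit address for a numeric spelling, or None."""
    parts = text.split(".")
    if not 1 <= len(parts) <= 4:
        return None
    values = [parse_part(p) for p in parts]
    if any(v is None for v in values):
        return None
    *head, last = values
    # Leading parts are single octets; the last part fills the rest.
    last_bits = 8 * (4 - len(head))
    if any(v > 0xFF for v in head) or last >= 1 << last_bits:
        return None
    result = 0
    for v in head:
        result = (result << 8) | v
    return (result << last_bits) | last


SPELLINGS = [
    "0xde.0xad.0xbe.0xef",
    "222.173.190.239",
    "0xdeadbeef",
    "0336.0255.0276.0357",
    "033653337357",
    "3735928559",
]

if __name__ == "__main__":
    for s in SPELLINGS:
        print(s, hex(parse_ipv4_numeric(s)))
    print("www.xn--coca-cola", parse_ipv4_numeric("www.xn--coca-cola"))
```

The "x" and "n" alone are enough to rule out the decimal and octal
forms, and the hex form needs a leading "0x", so no xn-- label can
pass parse_part.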

	xn--* is a legal TLD under RFC 952 and it was not made illegal
	by RFC 1123.

>  <toplabel> 
> > I hope that it is out of scope for this WG, but that is
> > certainly subject to debate.  As you know, I've written
> > the IESG asking them to give some priority to validating
> > that erratum.
> 
> I don't understand why you hope that this is out of scope.
> It has to be fixed for future IDN TLDs, and your erratum
> update killed the happy theory that RFC 3696 is the last
> word on <toplabel>, e.g., as used in the following draft:
>  
> <http://www.icann.org/topics/dns-stability-draft-paper-06feb08.pdf>
> 
> Folks are grabbing for anything, informational RFC or even
> unverified erratum, just to get any "authoritative" source
> about this.
> 
> > We probably should extend the 1123 rule to permit those
> > hyphens but, IMO, that is as far as we should go.
> 
> That is already good enough, there are only two variants,
> 
>   toplabel = <let> [1*61<l-d-h> <let-dig>]  ; variant 1
>   toplabel = <let> 0*61<l-d-h> <let-dig>    ; variant 2
> 
> Let's just pick what you like better, but not variant 1 :-)
> 
> > A combination of I-Ds, informational and experimental
> > documents, and opinions that don't represent demonstrated
> > community consensus.  Sorry if I don't find much
> > authority in these.
> 
> That is because everybody waits for you to say what you
> think is best in a published RFC on standards track with
> an "updates: 1123" note.  The USEFOR RFC is on standards
> track, with the 3696 version of variant 2 (= length two).
> 
> >> By definition an A-label is also a valid <toplabel>, 
> >> and we don't need to talk about this.
>  
> > By whose definition?
> 
> By your definition in either RFC 3696 or Errata ID 1353,
> and your definition in idnabis-rationale.  The latter
> defines (in prose)...
> 
> x-label = "xn--" *<l-d-h> "-" 1*<let-dig> ; length 6..63
> 
> ...and any valid A-label matches <x-label>.  Because any
> <x-label> also matches <ldh-label>, and any <toplabel> 
> is simply an <ldh-label> starting with a letter (length
> 1..63 or 2..63 depending on the chosen variant) I get:
> 
> * "x" is a letter
> * "xn--" + "-" + 1*<let-dig> has length 6, and 6 > 1 
> * 6..63 has the same maximal length as 1+61+1     
> 
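The subset argument in the quoted text can be checked mechanically.
What follows is only a sketch in Python: the regular expressions are
my transcription of the quoted <x-label> rule and of the usual LDH
label shape, the function names are invented for the example, and the
"xn--" prefix is matched in lower case only.  <toplabel> is taken to
be an LDH label that starts with a letter, as in the quoted text.

```python
import re

# LDH label: letters, digits, hyphens; no leading or trailing hyphen.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")

# x-label = "xn--" *<l-d-h> "-" 1*<let-dig>, as written in the quote.
X_LABEL = re.compile(r"xn--[A-Za-z0-9-]*-[A-Za-z0-9]+")

MAX_LABEL_LENGTH = 63


def is_ldh_label(label):
    """True for an LDH label of length 1..63."""
    return (len(label) <= MAX_LABEL_LENGTH
            and LDH_LABEL.fullmatch(label) is not None)


def is_x_label(label):
    """True if the label matches the quoted <x-label> rule (length 6..63)."""
    return (len(label) <= MAX_LABEL_LENGTH
            and X_LABEL.fullmatch(label) is not None)


def is_toplabel(label):
    """True for an LDH label whose first character is a letter."""
    return is_ldh_label(label) and label[0].isalpha()


if __name__ == "__main__":
    for label in ["xn--coca-cola", "xn--cocacola", "xn--foo", "xn---a"]:
        print(label, is_x_label(label), is_ldh_label(label),
              is_toplabel(label))
```

With these definitions xn--coca-cola is an <x-label> while xn--foo
and xn--cocacola are not, because the rule as quoted requires a
hyphen after the prefix followed by at least one letter or digit.
Whether a given <x-label> is also a valid A-label is a separate
question that this sketch does not try to answer.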
> > all the ICANN test collection proves is that one can 
> > violate 1123 without causing very many problems, at
> > least for the mostly-web applications that have been
> > used in tests.
> 
> Joke - I had to fix my rxwhois client, anything with a
> hyphen went into the "guess what NIC handle" procedure.
> 
> > not obviously in the WG's charter.
> 
> | In particular, IDNs continue to use the "xn--" prefix
> 
> The Charter wants "xn--", it does not say "but not for
> TLDs".  Vint or Lisa would tell us if they don't want
> IDN TLDs for some obscure reason.
> 
>   <potentially open question: valid U-toplabel>
> >> Depending on the script "one code point" can express
> >> things that would need several letters in other 
> >> scripts.  ICANN can sort this out.
> 
> > It is not clear who gets to "sort this out".
> 
> What I wrote was a proposal.  Do you want to tackle the
> minimal length of a U-toplabel in Unicode code points?
> 
> I'm not (yet) aware of technical reasons to do this, a
> corresponding A-toplabel has length 6..63, is that not
> good enough ?
> 
> > again, I hope that work doesn't belong to this WG.
> 
> That matches "ICANN can sort this out", it would be bad
> if we say "two code points", and some language in some
> script uses a single code point for "motherland".
> 
> The Chinese IDN test TLDs use only two code points for
> "test".  The Cyrillic RF proposal uses two code points,
> and it won't surprise me if somebody wants or needs one.
> 
> > The current rule (banning anything with "--" in positions
> > two and three that isn't a valid A-label) in IDNA2008
> > is extremely conservative wrt prefix forms as a means
> > of avoiding nonsense
> 
> Nobody can prevent me from creating a label fe--2008-11-11,
> it is LDH, and it makes sense from my POV.  How could we
> find out if somebody uses similar labels already, and get
> them to change it ?
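
To make the fe--2008-11-11 example concrete, here is a small Python
sketch.  The helper names are hypothetical, and the check only looks
at the shape of the label (hyphens as the third and fourth
characters); it does not decide whether an xn-- label is a valid
A-label, which would need a Punycode round trip.

```python
import re

# LDH label: letters, digits, hyphens; no leading or trailing hyphen.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_ldh_label(label):
    """True for an LDH label of length 1..63."""
    return len(label) <= 63 and LDH_LABEL.fullmatch(label) is not None


def has_prefix_hyphens(label):
    """True if the third and fourth characters are both hyphens."""
    return label[2:4] == "--"


def is_non_idna_prefix_label(label):
    """True for LDH labels with '--' in that position but no xn-- prefix."""
    return (is_ldh_label(label)
            and has_prefix_hyphens(label)
            and not label.lower().startswith("xn--"))


if __name__ == "__main__":
    for label in ["fe--2008-11-11", "xn--coca-cola", "example"]:
        print(label, is_ldh_label(label), has_prefix_hyphens(label),
              is_non_idna_prefix_label(label))
```

Under the rule quoted above, labels for which
is_non_idna_prefix_label returns True are the ones that would be
banned even though they are perfectly ordinary LDH labels.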
> 
> The IDNA "xn--" approach used a proper subset of LDH for
> its purposes out of necessity, but I see no technical
> necessity to say that other LDH subsets are *invalid*.  
> 
> IMO figuring out which <x-label>s (see above) are valid
> A-labels is interesting enough.  
> 
> > That isn't much of a restriction, since no one has
> > really demonstrated a need for such strings.
> 
> There is no need to have hmdmhdfmhdjmzdtjmzdtzktdkztdjz
> as label, nevertheless I ended up with it, after a piece
> of software rejected about a dozen less obscure ideas,
> and I lost my patience.  IIRC I needed a working jabber
> account fast, I wasn't aware that this would be a label
> and local part later.
> 
> > If the WG concludes that is excessive and wants to
> > drop back all or part of the way to a rule that merely
> > says that, if the label starts in "xn--", it must be
> > an A-label, I won't lose any sleep over it...
> 
> I guess you could say that any <x-label> that is not a
> valid A-label MUST NOT be registered as <toplabel>, and
> that it also MUST NOT be registered in any "decent" TLD
> registry (at any level managed by the TLD registry).
> 
> That is already difficult.  A constructed example: what if
> a URI scheme xn--foo needs xn--foo.uri.arpa?  Subtle
> point, this is no <x-label> as defined above.
> 
>  <xn--cocacola> 
> > If one decides that an A-label that cannot satisfy
> > those rules is "whatever it is", one ends up with a
> > string with two possible interpretations depending
> > on the version of Unicode being used
> 
> Okay, to eliminate any "it is not even an <x-label>"
> argument let's take xn--coca-cola, a valid <x-label>.
> 
> The MUSTard would guarantee that xn--coca-cola cannot
> be registered if it has no corresponding U-label for
> Unicode 5.1 (caveat, maybe it has, I didn't check it).
> 
> At other levels folks will do what they want no matter
> what IDNAbis tries to decree.  Applications could not
> decode it to a U-label, because there is no U-label.
> 
> Isn't that good enough, treat xn--coca-cola "as is" ?
> 
>  Frank
> 
-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: Mark_Andrews at isc.org

