A-label definition

John C Klensin klensin at jck.com
Sat Jun 21 01:20:08 CEST 2008



--On Friday, 20 June, 2008 06:04 +0200 Frank Ellermann
<hmdmhdfmhdjmzdtjmzdtzktdkztdjz at gmail.com> wrote:

> YAO Jiankang wrote:
> 
>> It is better if we clarify 3 definitions.
> 
>> LDH, which is the domain name label defined in RFC 1034 and
>> 1035
> 
> I think we need the RFC 3696 concept "updating" the RFC 1123
> idea, which in turn updated RFC 1034/1035.  It's a subtle
> point, but the older definitions had "must start with a
> letter", that's obsolete.
> 
> RFC 1123 fixed that claiming that the <toplabel> can contain
> only letters, but that would rule out IDN TLDs.  RFC 3696
> finally got it right, but did not update RFC 1123.  Somebody
> has to fix this, we are in the position to do it.  

3696 is an informational document that doesn't update
anything... and can't.  Even had it been standards-track, it
deliberately takes a very permissive view toward what is
permitted and has little to do with the present situation, much
less what is permitted as a TLD name.

>> U-label , which contains at least a non-ASCII character
> 
> Okay, but please without the "standard Unicode encoding" blurb,
> it only needs Unicode code points (the numbers, any encoding).

No.  Unless I misunderstand what you are asking for, it really
is important that U-label and A-label refer to _valid_ IDNA
label forms.  If we go off on an adventure into "any Unicode
code point", we rapidly slide down the slippery slope toward
labels that are IDNA-invalid (which is part of what caused so
much confusion about what "punycode" referred to) and the use of
binary and other non-character labels (see RFC 2181).

>> A-label, which is transformed from U-label with the algorithm
>> (punycode), plus a prefix such as XN--  (a label with the
>> prefix XN-- that cannot be converted to a U-label is not a valid
>> A-label)
> 
> +1, define A-label based on U-label, and not the other way
> around.

At the moment, neither is defined in terms of the other (in
rationale).   There is an implication of linkage, but that is
because A-labels have to be IDNA-valid and IDNA-validity is
defined in terms of operations on U-labels.  What are you
suggesting?

>> LDH label includes A-label.
> 
> +1, that is the whole point of this business.

No, actually, "rationale" creates, effectively, four categories
which are disjoint:

	* LDH labels (as defined in 1035, with no prefix or
	other IDNA implications)
	* A-labels (prefix, punycode encoding of the rest of the
	string, IDNA-valid)
	* U-labels (Unicode string that is valid under IDNA)
	* Invalid

Treating A-labels as a subset of LDH labels gets us back into
situations in which there are LDH labels that look like A-labels
and aren't.  And that has been a _huge_ source of confusion and
something of a source of bad behavior.
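
To make the four disjoint categories concrete, here is a rough
sketch of a classifier.  It is an illustration only, not text
from "rationale": the function names are made up, and
is_idna_valid() is a deliberately crude stand-in for the real
IDNA validity rules, which are much stricter.  The LDH pattern
assumes the relaxed RFC 1123 syntax (a leading digit is allowed).
The only library call used is Python's standard "punycode" codec.

```python
import re

ACE_PREFIX = "xn--"
# Letters, digits, hyphen; no leading or trailing hyphen; at most 63 octets.
LDH_RE = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")


def is_idna_valid(ulabel: str) -> bool:
    """Hypothetical placeholder, NOT the real IDNA validity test.

    Assumption for this sketch: at least one non-ASCII character and
    only alphanumerics or hyphens.
    """
    return (
        any(ord(c) > 127 for c in ulabel)
        and all(c.isalnum() or c == "-" for c in ulabel)
    )


def classify(label: str) -> str:
    """Return exactly one of: 'LDH label', 'A-label', 'U-label', 'invalid'."""
    if not label.isascii():
        return "U-label" if is_idna_valid(label) else "invalid"
    if not LDH_RE.match(label):
        return "invalid"
    lowered = label.lower()
    if not lowered.startswith(ACE_PREFIX):
        return "LDH label"
    # Has the prefix: it is an A-label only if the rest decodes to an
    # IDNA-valid string and re-encodes to the same thing.  Otherwise it
    # merely looks like an A-label and is treated as invalid.
    rest = lowered[len(ACE_PREFIX):]
    try:
        decoded = rest.encode("ascii").decode("punycode")
    except ValueError:  # UnicodeError is a subclass of ValueError
        return "invalid"
    if is_idna_valid(decoded) and decoded.encode("punycode").decode("ascii") == rest:
        return "A-label"
    return "invalid"


for sample in ["example", "xn--bcher-kva", "b\u00fccher", "xn--abc", "-bad"]:
    print(sample, "->", classify(sample))
```

The point of the last branch is that a string with the prefix that
fails the round trip never falls back into the LDH category, which
is what keeps the categories disjoint.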

    john


More information about the Idna-update mailing list