A-label definition
John C Klensin
klensin at jck.com
Sat Jun 21 01:20:08 CEST 2008
--On Friday, 20 June, 2008 06:04 +0200 Frank Ellermann
<hmdmhdfmhdjmzdtjmzdtzktdkztdjz at gmail.com> wrote:
> YAO Jiankang wrote:
>
>> It is better if we clarify 3 definitions.
>
>> LDH, which is the domain name label defined in RFC 1034 and
>> 1035
>
> I think we need the RFC 3696 concept "updating" the RFC 1123
> idea, which in turn updated RFC 1034/1035. It's a subtle
> point, but the older definitions had "must start with a
> letter", that's obsolete.
>
> RFC 1123 fixed that claiming that the <toplabel> can contain
> only letters, but that would rule out IDN TLDs. RFC 3696
> finally got it right, but did not update RFC 1123. Somebody
> has to fix this, we are in the position to do it.
3696 is an informational document that doesn't update
anything... and can't. Even had it been standards-track, it
deliberately takes a very permissive view toward what is
permitted and has little to do with the present situation, much
less what is permitted as a TLD name.
>> U-label, which contains at least one non-ASCII character
>
> Okay, but please without the "standard Unicode encoding" blurb,
> it only needs Unicode code points (the numbers, any encoding).
No. Unless I misunderstand what you are asking for, it really
is important that U-label and A-label refer to _valid_ IDNA
label forms. If we go off on an adventure into "any Unicode
code point", we rapidly slide down the slippery slope toward
labels that are IDNA-invalid (which is part of what caused so
much confusion about what "punycode" referred to) and the use of
binary and other non-character labels (see RFC 2181).
>> A-label, which is transformed from U-label with the algorithm
>> (punycode), plus a prefix such as XN-- (a label with the
>> prefix XN-- that cannot be converted to a U-label is not a
>> valid A-label)
>
> +1, define A-label based on U-label, and not the other way
> around.
At the moment, neither is defined in terms of the other (in
rationale). There is an implication of linkage, but that is
because A-labels have to be IDNA-valid and IDNA-validity is
defined in terms of operations on U-labels. What are you
suggesting?
>> LDH label includes A-label.
>
> +1, that is the whole point of this business.
No, actually, "rationale" creates, effectively, four categories
which are disjoint:
    * LDH labels (as defined in 1035, with no prefix or
      other IDNA implications)
    * A-labels (prefix, punycode encoding of the rest of the
      string, IDNA-valid)
    * U-labels (Unicode string that is valid under IDNA)
    * Invalid
Treating A-labels as a subset of LDH labels gets us back into
situations in which there are LDH labels that look like A-labels
and aren't. And that has been a _huge_ source of confusion and
something of a source of bad behavior.
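To make the four categories concrete, here is a rough sketch in
Python. It is an illustration only, not text from "rationale". The
names classify(), looks_idna_valid(), and LDH_RE are hypothetical.
The validity check is a crude stand-in for the real IDNA rules, and
allowing a leading digit in an LDH label is an assumption. Only the
"punycode" codec from the Python standard library is used.

```python
import re

ACE_PREFIX = "xn--"  # the IDNA prefix; compared case-insensitively here

# Assumption: letters, digits, hyphen; 1 to 63 characters; no leading or
# trailing hyphen. A leading digit is allowed (the RFC 1123 relaxation).
LDH_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def looks_idna_valid(ulabel: str) -> bool:
    """Hypothetical stand-in for the real IDNA validity test."""
    return (
        any(ord(ch) > 0x7F for ch in ulabel)                 # has a non-ASCII character
        and all(ch.isalnum() or ch == "-" for ch in ulabel)  # crude repertoire check
        and ulabel == ulabel.lower()                         # no uppercase forms
        and not ulabel.startswith("-")
        and not ulabel.endswith("-")
    )


def classify(label: str) -> str:
    """Place a label in exactly one of the four disjoint categories."""
    if not label.isascii():
        return "U-label" if looks_idna_valid(label) else "invalid"
    if not LDH_RE.fullmatch(label):
        return "invalid"
    lowered = label.lower()
    if not lowered.startswith(ACE_PREFIX):
        return "LDH label"
    # Prefixed: it must decode to a valid U-label and re-encode to the
    # same string, otherwise it only looks like an A-label.
    rest = lowered[len(ACE_PREFIX):]
    try:
        decoded = rest.encode("ascii").decode("punycode")
    except ValueError:  # UnicodeError is a subclass of ValueError
        return "invalid"
    if looks_idna_valid(decoded) and decoded.encode("punycode").decode("ascii") == rest:
        return "A-label"
    return "invalid"


if __name__ == "__main__":
    for label in ("example", "xn--bcher-kva", "bücher", "xn--abc-d"):
        print(label, "->", classify(label))
```

Note that "xn--abc-d" consists only of letters, digits, and hyphens,
yet the sketch puts it in "invalid" and not in "LDH label". That is
the disjointness described above: a prefixed string that fails the
IDNA checks is not quietly treated as an ordinary LDH label.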
john