Unconditional punycode conversion
Vint Cerf
vint at google.com
Wed Mar 9 19:11:52 CET 2011
The rule is in the definitions.
The string is valid for non-IDN-aware programs, but under IDNA all strings
with "--" in positions 3 and 4 are reserved.
v
On Wed, Mar 9, 2011 at 1:08 PM, Simon Josefsson <simon at josefsson.org> wrote:
> Andrew Sullivan <ajs at shinkuro.com> writes:
>
> > On Wed, Mar 09, 2011 at 06:13:57PM +0100, Simon Josefsson wrote:
> >> Andrew Sullivan <ajs at shinkuro.com> writes:
> >>
> >> > On Wed, Mar 09, 2011 at 05:36:10PM +0100, Simon Josefsson wrote:
> >> >> To verify my understanding: the label "ab--cd" is permitted by IDNA2008
> >> >> despite it having "--" in the third and fourth character positions?
> >> >> That would be because section 5.4 only applies to non-ASCII labels.
> >> >
> >> > No. See section 2.3.1 of RFC 5890.
> >>
> >> I don't see any MUST/SHOULD language there. RFC 5891 says:
> >>
> >> Putative U-labels with any of the
> >> following characteristics MUST be rejected prior to DNS lookup:
> >> ...
> >> o Labels containing "--" (two consecutive hyphens) in the third and
> >> fourth character positions.
> >>
> >> Is "ab--cd" a putative U-label?
> >
> > Certainly not. It has no high bits. But,
> >
> > To facilitate clear description, two new subsets of LDH labels are
> > created by the introduction of IDNA. These are called Reserved LDH
> > labels (R-LDH labels) and Non-Reserved LDH labels (NR-LDH labels).
> > Reserved LDH labels, known as "tagged domain names" in some other
> > contexts, have the property that they contain "--" in the third and
> > fourth characters but which otherwise conform to LDH label rules.
> > Only a subset of the R-LDH labels can be used in IDNA-aware
> > applications. That subset consists of the class of labels that begin
> > with the prefix "xn--" (case independent), but otherwise conform to
> > the rules for LDH labels. That subset is called "XN-labels" in this
> > set of documents. XN-labels are further divided into those whose
> > remaining characters (after the "xn--") are valid output of the
> > Punycode algorithm [RFC3492] and those that are not (see below). The
> > XN-labels that are valid Punycode output are known as "A-labels" if
> > they also meet the other criteria for IDNA-validity described below.
> > Because LDH labels (and, indeed, any DNS label) must not be more than
> > 63 octets in length, the portion of an XN-label derived from the
> > Punycode algorithm is limited to no more than 59 ASCII characters.
> > Non-Reserved LDH labels are the set of valid LDH labels that do not
> > have "--" in the third and fourth positions.
> >
> > So, according to the above, NR-LDH labels never have -- in position 3
> > and 4. So ab--cd must be an R-LDH label.
> >
> > Of the R-LDH labels, only XN-labels are possibly A-labels.
> >
> > Only A-labels and U-labels are allowed under IDNA2008 (or NR-LDH
> > labels, but those aren't actually subject to IDNA2008 of course).
> >
> > This is illustrated in Figure 1 in RFC 5890, although only fans of
> > Venn diagrams (of which I am one) will find it helpful.
>
> I don't see any of this reflected in RFC 5891. As far as I can tell,
> "ab--cd" is permitted since there is no rule to forbid it.
>
> Is a new rule needed to forbid "ab--cd" in RFC 5891 or is there an error
> in the existing "--" rule for U-labels, or something else?
>
> /Simon
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update
>
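
For readers following the thread, the label classes from RFC 5890 section 2.3.1 quoted above can be sketched in a few lines of Python. This is only an illustration of the definitions as Andrew and Vint read them, not code from any IDNA implementation. The function name `classify_label` and the returned class strings are made up for this example, and the sketch does not check whether an XN-label is valid Punycode output, so it cannot say whether a label is actually an A-label.

```python
import re

# LDH label: ASCII letters, digits, and hyphens, 1 to 63 octets long,
# with no hyphen in the first or last position.
_LDH = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def classify_label(label: str) -> str:
    """Classify one DNS label (hypothetical helper, names are assumptions).

    Returns one of 'NON-LDH', 'NR-LDH', 'XN', or 'R-LDH (not XN)'.
    """
    if not _LDH.fullmatch(label):
        # Includes anything with non-ASCII characters (putative U-labels).
        return "NON-LDH"
    # Third and fourth character positions are string indices 2 and 3.
    if label[2:4] != "--":
        return "NR-LDH"
    # Reserved LDH label; only the "xn--" prefix (case independent)
    # can be used by IDNA-aware applications.
    if label[:2].lower() == "xn":
        return "XN"
    return "R-LDH (not XN)"


if __name__ == "__main__":
    for candidate in ("example", "ab--cd", "xn--bcher-kva", "XN--abc"):
        print(candidate, "->", classify_label(candidate))
```

Under this reading, "ab--cd" lands in the reserved class without being an XN-label, which is the case the thread is arguing about. Whether RFC 5891 needs an explicit rule to reject it is the open question Simon raises; the sketch takes no position on that.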