Unconditional punycode conversion

Vint Cerf vint at google.com
Wed Mar 9 19:11:52 CET 2011


The rule is in the definitions (RFC 5890, section 2.3.1).

The string is valid for non-IDNA-aware programs, but under IDNA all labels
with "--" in positions 3 and 4 are reserved.
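
As a rough illustration of how the definitions in RFC 5890 section 2.3.1
partition ASCII labels, here is a minimal sketch. The function name
classify_label and the returned strings are hypothetical, chosen only for
this example. It does not check whether an XN-label is valid Punycode
output, so it cannot tell an A-label from any other XN-label.

```python
import re

# LDH label syntax: ASCII letters, digits and hyphens, 1 to 63 octets,
# with no hyphen at the start or the end.
_LDH = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def classify_label(label: str) -> str:
    """Classify a label using the terms of RFC 5890 section 2.3.1.

    Hypothetical helper for illustration only; not a full IDNA2008 check.
    """
    if not _LDH.fullmatch(label):
        return "not-LDH"
    # Non-Reserved LDH labels never have "--" in positions 3 and 4.
    if label[2:4] != "--":
        return "NR-LDH"
    # Of the Reserved LDH labels, only those with the "xn--" prefix
    # (case independent) are XN-labels, and so candidates for A-labels.
    if label[:4].lower() == "xn--":
        return "XN-label"
    return "R-LDH (not XN)"


for example in ("example", "ab--cd", "xn--bcher-kva", "-bad"):
    print(example, "->", classify_label(example))
```

Under this sketch "ab--cd" falls in the reserved set without being an
XN-label, which is the case discussed in the thread.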

v


On Wed, Mar 9, 2011 at 1:08 PM, Simon Josefsson <simon at josefsson.org> wrote:

> Andrew Sullivan <ajs at shinkuro.com> writes:
>
> > On Wed, Mar 09, 2011 at 06:13:57PM +0100, Simon Josefsson wrote:
> >> Andrew Sullivan <ajs at shinkuro.com> writes:
> >>
> >> > On Wed, Mar 09, 2011 at 05:36:10PM +0100, Simon Josefsson wrote:
> >> >> To verify my understanding: the label "ab--cd" is permitted by IDNA2008
> >> >> despite it having "--" in the third and fourth character positions?
> >> >> That would be because section 5.4 only applies to non-ASCII labels.
> >> >
> >> > No.  See section 2.3.1 of RFC 5890.
> >>
> >> I don't see any MUST/SHOULD language there.  RFC 5891 says:
> >>
> >>    Putative U-labels with any of the
> >>    following characteristics MUST be rejected prior to DNS lookup:
> >> ...
> >>    o  Labels containing "--" (two consecutive hyphens) in the third and
> >>       fourth character positions.
> >>
> >> Is "ab--cd" a putative U-label?
> >
> > Certainly not.  It has no high bits.  But,
> >
> >    To facilitate clear description, two new subsets of LDH labels are
> >    created by the introduction of IDNA.  These are called Reserved LDH
> >    labels (R-LDH labels) and Non-Reserved LDH labels (NR-LDH labels).
> >    Reserved LDH labels, known as "tagged domain names" in some other
> >    contexts, have the property that they contain "--" in the third and
> >    fourth characters but which otherwise conform to LDH label rules.
> >    Only a subset of the R-LDH labels can be used in IDNA-aware
> >    applications.  That subset consists of the class of labels that begin
> >    with the prefix "xn--" (case independent), but otherwise conform to
> >    the rules for LDH labels.  That subset is called "XN-labels" in this
> >    set of documents.  XN-labels are further divided into those whose
> >    remaining characters (after the "xn--") are valid output of the
> >    Punycode algorithm [RFC3492] and those that are not (see below).  The
> >    XN-labels that are valid Punycode output are known as "A-labels" if
> >    they also meet the other criteria for IDNA-validity described below.
> >    Because LDH labels (and, indeed, any DNS label) must not be more than
> >    63 octets in length, the portion of an XN-label derived from the
> >    Punycode algorithm is limited to no more than 59 ASCII characters.
> >    Non-Reserved LDH labels are the set of valid LDH labels that do not
> >    have "--" in the third and fourth positions.
> >
> > So, according to the above, NR-LDH labels never have -- in position 3
> > and 4.  So ab--cd must be an R-LDH label.
> >
> > Of the R-LDH labels, only XN-labels are possibly A-labels.
> >
> > Only A-labels and U-labels are allowed under IDNA2008 (or NR-LDH
> > labels, but those aren't actually subject to IDNA2008 of course).
> >
> > This is illustrated in Figure 1 in RFC 5890, although only fans of
> > Venn diagrams (of which I am one) will find it helpful.
>
> I don't see any of this reflected in RFC 5891.  As far as I can tell,
> "ab--cd" is permitted since there is no rule to forbid it.
>
> Is a new rule needed to forbid "ab--cd" in RFC 5891 or is there an error
> in the existing "--" rule for U-labels, or something else?
>
> /Simon
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update
>