comments on last call drafts

Tue Oct 13 20:41:34 CEST 2009

Hey, I just read through the Last Call drafts and have some comments.

defs-11:

  * 2.3.2.1, the end of the definition of A-label is broken:

        If and only if a string meeting can be decoded into a
        U-label, then it is an A-label.

    Comparing with defs-10, it looks like "a string meeting" should be
    "a string meeting the above requirements".

  * 2.3.1:

        That subset is called 'XN-labels' in this set of documents.

    The term gets imported into rationale-13, but it is never actually
    *used* anywhere outside the defs document. Grepping for "xn--"
    shows some places that definitely could be using it.

rationale-13:

  * 7.4 (The Question of Prefix Changes) and its subsections are still
    worded as though IDNA2008 were a work in progress. E.g.:

        An IDNA upgrade would require a prefix change if... [t]he
        conversion of an A-label to Unicode (i.e., a U-label) yields
        one string under IDNA2003 (RFC3490) and a different string
        under IDNA2008.

    If the (current) goal of the section is to document the sorts of
    changes from IDNA2003 that *would have* required a prefix change,
    then it should be more past-tense-y. If the goal is to document
    the sorts of possible changes that might require a prefix change
    *in the future*, then it should contrast IDNA2008 with that future
    spec, rather than IDNA2003 with IDNA2008.

protocol-16:

  * 1. Introduction:

        IDNA applies only to DNS labels. The base DNS standards
        [RFC1034] [RFC1035] and their various updates specify how to
        combine labels into fully-qualified domain names and parse
        labels out of those names.

        This document describes two separate protocols, one for IDN
        registration (Section 4) and one for IDN lookup (Section 5).

    If "IDNA applies only to DNS labels", then you should be able to
    look up "_ldap._tcp.exámplè.net". But if the protocol is "for IDN
    lookup" then you can't, because a domain name containing an
    underscore label isn't an IDN.

    In general, the document seems to frequently ignore the
    distinction between labels and domain names. E.g., "Whenever a
    domain name is put into an IDN-unaware domain name slot... it...
    must be either an A-label or an NR-LDH-label" (3.1) and "The
    A-label resulting from the conversion in Section 5.5... is looked
    up in the DNS, using normal DNS resolver procedures" (5.6). Some
    of this could be fixed by adding an explicit "for each label in
    the domain name" loop around the middle of the registration and
    lookup protocols.
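
    Roughly what I mean by that loop, as a Python sketch. The function
    names are made up, the stdlib "punycode" codec only does the raw
    encoding, and none of the validity checks from the drafts are
    included, so treat it as an illustration of the shape rather than
    an implementation of either protocol:

```python
def label_to_ascii(label: str) -> str:
    """Hypothetical per-label step: ASCII labels pass through untouched,
    anything else is Punycode-encoded and given the ACE prefix."""
    if label.isascii():
        return label  # e.g. "_ldap" or "net": not an IDN label, left alone
    # Raw encoding only; the drafts' validity checks are NOT performed here.
    return "xn--" + label.encode("punycode").decode("ascii")


def domain_to_ascii(name: str) -> str:
    """Apply the per-label step to every label of a dotted domain name."""
    return ".".join(label_to_ascii(label) for label in name.split("."))
```

    With that structure, "_ldap._tcp.exámplè.net" is not a problem:
    the underscore labels and "net" come back unchanged and only the
    third label is converted.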

  * 3.1. (Requirements) forbids putting a U-label into an IDN-unaware
    slot, but doesn't say what an app needing to convert a U-label to
    an A-label actually *can* do in this case, since there is no
    protocol for converting a U-label that you are neither registering
    nor looking up.

    I'm assuming you're supposed to use the lookup protocol, minus the
    actual DNS lookup step, but nothing anywhere actually says that
    you can/should do that. (Maybe there are security issues with
    handing an A-label to an IDN-unaware app that might warrant
    additional checks beyond the lookup case?)

  * 3.2.1. DNS Resource Records:

        IDNA applies only to domain names in the NAME and RDATA
        fields of DNS resource records whose CLASS is IN. See RFC
        1034 [RFC1034] for precise definitions of these terms.

    Those terms come from RFC 1035, not 1034.

  * 4.2.4. (Registration Validation Summary) shouldn't really be
    called a "summary", since it introduces two new restrictions not
    previously mentioned ("at least one non-ASCII character" and "63
    or fewer characters long in ACE form"). Also, I think the
    reference to Section 4.2 should be to Section 4.2.3? (Otherwise
    you get infinite recursion...)

  * 4.4 Punycode Conversion:

        The failure conditions identified in the Punycode encoding
        procedure cannot occur if the input is a U-label as determined
        by the steps above.

    But "the steps above" require running the Punycode encoding
    procedure on the putative U-label to determine its length when ACE
    encoded, so you won't know if it's a real U-label until after
    running Punycode and possibly overflowing. So if this sentence was
    meant to imply that you don't need to check for overflow, then
    it's wrong (and if it's not meant to imply that, then it's
    misleading.)
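
    To make the ordering problem concrete, here is a Python sketch
    (the helper name is made up, and the character-level rules are
    assumed to have been checked elsewhere). Python integers don't
    overflow, so the "except" branch just stands in for the overflow
    check that a fixed-width implementation would need at that point:

```python
def encode_candidate(label: str) -> str:
    """Return the ACE form of a putative U-label, or raise ValueError.

    Hypothetical helper: only the two restrictions from the validation
    summary (non-ASCII content, ACE length) are checked here.
    """
    if label.isascii():
        raise ValueError("no non-ASCII character, so not a U-label")
    try:
        # The encoder has to run *before* the ACE length is known, so
        # its failure conditions still have to be handled right here.
        ace = "xn--" + label.encode("punycode").decode("ascii")
    except UnicodeError as exc:
        raise ValueError("Punycode encoding failed") from exc
    if len(ace) > 63:
        raise ValueError("ACE form is longer than 63 characters")
    return ace
```

    The length test can only come after the encoding call, which is
    why the encoder's failure cases can't be waved away in advance.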

bidi-06:

  * 4.2, the running text states that the Yiddish representation of
    "YIVO" is "YOD YOD HIRIQ VAV VAV ALEF QAMATS" (two YODs and two
    VAVs), but the list of codepoints immediately following it has
    only one YOD and one VAV.

  * 4.3:

        By requiring that the first or last character of a string be
        category R or AL, RFC 3454 prohibited a string containing
        right-to-left characters from ending with a number.

    "first or last character" seems like it should just be "last
    character"?
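
    A quick Python sketch of just the last-character check, to show
    that it is the part that rules out a trailing number. The function
    name is made up, and the bidi categories from Python's unicodedata
    module stand in for the RFC 3454 tables, so this is an
    approximation:

```python
import unicodedata

RTL = {"R", "AL"}  # Unicode bidi categories treated as right-to-left


def satisfies_last_char_rule(s: str) -> bool:
    """True unless s contains R/AL characters but does not end with one."""
    categories = [unicodedata.bidirectional(ch) for ch in s]
    if not any(cat in RTL for cat in categories):
        return True  # no right-to-left characters, rule does not apply
    return categories[-1] in RTL
```

    A string of two Hebrew letters followed by "1" fails this check,
    while "1" followed by a Hebrew letter passes it, so nothing about
    the first character is needed to prohibit the trailing digit.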

-- Dan