Comments on idnabis-protocol-02

Sun Jul 27 09:10:22 CEST 2008

Marcos,

Since no one else has commented on either of your notes from the
17th, I've made a pass through them.  Comments removed from the
list below have been incorporated into the working draft either
directly, as additional discussion placeholders, or both.

Many thanks for the careful reading.

--On Thursday, 17 July, 2008 09:50 +0200 "Marcos Sanz/Denic"
<sanz at denic.de> wrote:

>...
> * Section 4.1: The text block starting with "The registry MAY
> permit  [...]" and ending with "[...] MUST be rejected" could
> be better placed  under Section 4.3, since subsections of
> section 4 are thought as logical  steps in time.

I think it belongs where it is, for exactly that reason.
Sections 4.2 and 4.3 assume a native character string (putative
U-label) and make little sense if an A-label is provided
instead.  More explanation from you, or comments from others,
would be welcome.

>...

> * Section 4.2: "U-labels actually produced from A-labels".
> Doesn't the  definition of "U-label", as of idnabis-rationale,
> include the assumption  that it actually must be produced (or
> have been produced) from some  A-label? So the formulation is
> redundant/misleading.

No, the definition is (or should be) "must have been capable of
being produced", not "was produced".  But suggestions about
better text here would be welcome.

> * Section 4.3.2.2: As a matter of fact, this step is a
> special  instantiation of 4.3.2.3 ("all combining marks have a
> contextual rule that  does not allow them to appear at the
> beginning of a label"). Shouldn't  thus be subsumed into it?
> This way there would be different "kinds" of  rules and would
> contribute to simplicity.

While this would be elegant, characters that have contextual
rules are identified separately in Tables and must have entries
in the Contextual Rules registry.  There are just too many
combining marks to make treating them that way practical.
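
To illustrate the point: a test for a leading combining mark can be
driven by the Unicode General_Category alone, with no per-character
registry entry.  The following is only a minimal Python sketch, not
text from the draft.  The function name is hypothetical, and it
assumes that "combining mark" means General_Category Mn, Mc, or Me.

```python
import unicodedata


def starts_with_combining_mark(label: str) -> bool:
    """Return True if the first code point of `label` is a combining mark.

    Assumption: "combining mark" is taken to mean any code point whose
    Unicode General_Category starts with "M" (Mn, Mc, or Me).
    """
    if not label:
        return False
    return unicodedata.category(label[0]).startswith("M")
```

A registration-side check along these lines would reject the label
whenever the function returns True.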

> * Section 4.3.3: I'd drop the sentence starting with "For
> example", since  this section is a summary of the rest of the
> section and it should be kept  crispy. If at all, the example
> should appear in the corresponding  subsection of 4.3

See the note that is now in the text.  We've got a tension
between "show examples or be accused of making cases up" and
"crispy".

> * Section 4.4: s/SHOULD/should/. See my comment on section 6.2
> of the  rationale document (sent in separate mail). Usage of
> 2119-language should  be motivated by interoperability, which
> is not an issue here.

See entry P.6 on the issues list.

> * Section 4.5: There should be a hint for implementors on how
> to act if  the Punycode operation fails (or, alternatively, an
> explanation for why  the failure situations described in 3492
> cannot happen here at all).

Noted in text
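
For discussion, one way an implementation might treat failure at
this step is sketched below in Python.  This is an illustration
under stated assumptions, not the wording of the draft: it uses
Python's built-in "punycode" codec, assumes the input is a
non-ASCII putative U-label that has already passed the earlier
checks, and treats both a codec error and a result longer than the
63-octet DNS label limit as a rejection.  The function and constant
names are hypothetical.

```python
ACE_PREFIX = "xn--"      # ACE prefix used for A-labels
MAX_LABEL_OCTETS = 63    # DNS limit on the length of a single label


def to_a_label(u_label: str) -> str:
    """Convert a putative U-label to an A-label, or raise ValueError.

    Assumption: `u_label` contains at least one non-ASCII character and
    has already passed the validity checks that precede this step.
    """
    try:
        encoded = u_label.encode("punycode").decode("ascii")
    except UnicodeError as exc:
        # Treat any codec failure as a rejected registration.
        raise ValueError("Punycode conversion failed") from exc

    a_label = ACE_PREFIX + encoded
    if len(a_label) > MAX_LABEL_OCTETS:
        raise ValueError("A-label exceeds 63 octets")
    return a_label
```

With this sketch, to_a_label("bücher") yields "xn--bcher-kva", and a
label whose encoded form is too long is rejected rather than
truncated.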

>...
> * Section 5: "The resolution-side tests are more permissive
> and rely  heavily on the assumption that names that are
> present in the DNS are  valid". This is a dangerous assumption
> and can lead to careless  programming. See my comment on
> section 10.1.2 of idnabis-rationale,  sent in separate e-mail.

Comments in the notes on Rationale, but note that the assumption
that, if a string is in the DNS it is valid, is much stronger in
IDNA2003 than in the present work, which makes significantly
more precautionary tests.

>...
> * Section 5.5: The six bullets are very similar in content
> (but not in  wording) to those under section 4.3. That makes
> it difficult for  implementors to follow ("why is the text
> different? is there some subtle  meaning I am missing?") and
> adds unnecessary verbosity to the specification.  I suggest
> collecting the steps which are identical in registration and
> in  lookup, putting them in just one separate section called
> "Basic  Registration And Lookup Checks" and referring to that
> section from 4.3 and  5.5.

Added to Issues list as P.15.  Note that this interacts with
P.10.

> * Section 5.5, regarding anchor 20: if a label not satisfying
> the  idna2008-bidi requirements is not IDNA-valid, there is no
> point in letting  a resolver query that U-label, it can
> straightahead deliver a failure. So  IMHO the "SHOULD" should
> be a "MUST".

Waiting to hear what others think but, in any event, the text
has been adjusted to make the intent more clear.

> * Section 5.5: "the resolver MUST rely on the presence or
> absence of  labels in the DNS to determine the validity of
> those labels". Actually, it  can only be "to determine the
> existence of those labels", nothing further.

I'm not sure I understand this one.  If I do, then "exists"
creates a presumption of validity, at least up to a point.  See
above and Issue P.15.

>...
> * Section 7, 3rd paragraph: "privileged or anti privileged
> domains". I  haven't the slightest idea what is that supposed
> to mean.

Not my text (I think it came from 3490).  Rewritten to make it
more clear.

> * Appendix A: Neither here nor in rationale-01, section 13.2 I
> can find a  requirement for the IANA Contextual Rules Registry
> to be versioned. It  might be obvious, but it should be made
> explicit. This versioning must not  necessarily follow from
> Unicode versioning (one could imagine changes in  it that are
> not directly bound to Unicode progress). The same goes, btw, 
> for the derived property registry.

Noted as Issue P.16.  However, the IETF has not had good
success with version models.  What would you do with a version
number or equivalent if you had it?

>...
> * Appendix B: I am not sure of the usefulness of this whole
> Appendix;  major programming languages support directly
> Unicode Regexps, and if some  doesn't, the programmer can
> check widely available documentation.  Regarding anchor41:
> What part of a construction like "\p(Script:XXX)" is  fairly
> exotic? And how exotic is it in comparison with the bidi rules
> or  with the elaboration of the derived property? Keeping
> Appendix B will lead  to duplication of efforts and chances
> for inconsistency (for instance,  right at the beginning on
> the character hyphen-minus: "Must appear [sic]  at the
> beginning or end of a label"...). Though well-intended, I
> suggest  dropping the effort of Appendix B.

There isn't going to be any more effort.  If there is consensus
to keep the regular expressions, then Appendix B goes away.  If
there is consensus to keep the sequential rule form, then
Appendix A goes away.  Keeping both would be a maintenance
nightmare and an invitation to all sorts of bad things; that was
never the intent.
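
To make the maintenance concern concrete, here is one rule written
in both styles, as a rough Python sketch.  It is not taken from
either appendix.  It assumes the intended hyphen-minus rule is "must
not appear at the beginning or end of a label", and the names are
hypothetical.  Every rule kept in two forms would need this kind of
cross-checking each time it changed.

```python
import re

# Regular-expression form: no hyphen-minus at the start or end of the label.
HYPHEN_RULE_RE = re.compile(r"(?!-).*(?<!-)", re.DOTALL)


def hyphen_ok_regex(label: str) -> bool:
    """Apply the hyphen-minus rule as a single regular expression."""
    return HYPHEN_RULE_RE.fullmatch(label) is not None


def hyphen_ok_sequential(label: str) -> bool:
    """Apply the same rule as a sequence of explicit tests."""
    if label.startswith("-"):
        return False
    if label.endswith("-"):
        return False
    return True
```

The two functions agree on inputs such as "a-b", "-ab", and "ab-",
but nothing other than testing keeps them in agreement.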

Thanks again,
    john