Comments on idnabis-protocol-02
Marcos Sanz/Denic
sanz at denic.de
Thu Jul 17 09:50:53 CEST 2008
All,
and finally, some assorted comments on protocol-02:
* Section 3.2.1: s/and because/because/
* Section 3.2.1: s/hyphen./hyphen.)/
* Section 4.1: The text block starting with "The registry MAY permit
[...]" and ending with "[...] MUST be rejected" could be better placed
under Section 4.3, since the subsections of section 4 are meant as logical
steps in time.
* Section 4.2: This step starts with "Some system routine [...] ensures
that the proposed label is a Unicode string". But it may *not* be a
Unicode string, since the output of section 4.1 is, as defined, in a local
native character set. That is, the step of conversion to Unicode is
missing.
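To illustrate the missing step, here is a minimal Python sketch of the conversion. The function name `to_unicode` and the choice to reject undecodable input are my assumptions, not something the draft specifies:

```python
def to_unicode(raw: bytes, charset: str) -> str:
    """Convert a label from a local character set to a Unicode string.

    Hypothetical helper: the draft does not name such a routine.
    """
    try:
        # Strict decoding: any byte sequence invalid in `charset` is an error.
        return raw.decode(charset, errors="strict")
    except (UnicodeDecodeError, LookupError) as exc:
        # LookupError covers an unknown charset name.
        raise ValueError(f"cannot convert label from {charset!r}") from exc


# 0xFC is "ü" in ISO 8859-1.
print(to_unicode(b"b\xfccher", "iso-8859-1"))
```

Only after such a step can the checks of section 4.2 assume a Unicode string.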
* Section 4.2: "U-labels actually produced from A-labels". Doesn't the
definition of "U-label", as of idnabis-rationale, include the assumption
that it actually must be produced (or have been produced) from some
A-label? So the formulation is redundant/misleading.
* Section 4.3.2.2: As a matter of fact, this step is a special
instantiation of 4.3.2.3 ("all combining marks have a contextual rule that
does not allow them to appear at the beginning of a label"). Shouldn't it
thus be subsumed into 4.3.2.3? This way there would be fewer different
"kinds" of rules, which would contribute to simplicity.
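As a rough illustration of how small this check is, here is a Python sketch. It assumes that "combining mark" means a character whose Unicode General_Category starts with "M" (Mn, Mc or Me); the function name is hypothetical:

```python
import unicodedata


def starts_with_combining_mark(label: str) -> bool:
    """True if the first character of the label is a combining mark.

    Assumption: "combining mark" = General_Category Mn, Mc or Me.
    """
    return bool(label) and unicodedata.category(label[0]).startswith("M")


# U+0301 COMBINING ACUTE ACCENT in leading and in non-leading position.
print(starts_with_combining_mark("\u0301a"))
print(starts_with_combining_mark("a\u0301"))
```

Expressed like this, it is just one more per-character contextual rule.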
* Section 4.3.3: I'd drop the sentence starting with "For example", since
this section is a summary of the rest of the section and it should be kept
crisp. If at all, the example should appear in the corresponding
subsection of 4.3.
* Section 4.4: s/SHOULD/should/. See my comment on section 6.2 of the
rationale document (sent in separate mail). Usage of 2119-language should
be motivated by interoperability, which is not an issue here.
* Section 4.5: There should be a hint for implementors on how to act if
the Punycode operation fails (or, alternatively, an explanation for why
the failure situations described in 3492 cannot happen here at all).
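The kind of hint I have in mind could look like the following Python sketch. The policy shown (reject the label on any codec error, or if the result would not fit into a 63-octet DNS label once the ACE prefix is added) is my assumption, not text from the draft, and the function name is hypothetical:

```python
from typing import Optional

ACE_PREFIX = "xn--"      # ACE prefix used for A-labels
MAX_LABEL_OCTETS = 63    # DNS limit on the length of a single label


def punycode_or_none(u_label: str) -> Optional[str]:
    """Return the Punycode form of u_label (without prefix), or None on failure."""
    try:
        encoded = u_label.encode("punycode").decode("ascii")
    except UnicodeError:
        # Assumed policy: any failure of the Punycode step rejects the label.
        return None
    if len(ACE_PREFIX) + len(encoded) > MAX_LABEL_OCTETS:
        # Too long to be a DNS label once the prefix is prepended.
        return None
    return encoded


print(punycode_or_none("bücher"))
print(punycode_or_none("ü" * 100))
```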
* Section 4.5: s/the prefix/the ACE prefix/
* Section 5: "The resolution-side tests are more permissive and rely
heavily on the assumption that names that are present in the DNS are
valid". This is a dangerous assumption and can lead to careless
programming. See my comment on section 10.1.2 of idnabis-rationale,
sent in a separate e-mail.
* Section 5: Sentence starting with "Among other things, this distinction
[...]" is a bit irrelevant here; it should be moved to the rationale
document (and I even think that I already read about this issue there).
* Section 5.3, 2nd paragraph: 'mapping different "width" forms of the same
character'. Without context, it is a bit difficult to understand what is
meant here (I happen to know, but I think it needs a bit more phrasing for
a casual reader).
* Section 5.3: "Such localization changes are even further outside the
scope of this specification than the ones mentioned above". I think the
language is not appropriate for a standards document, and it should
suffice to say "Such localization changes are also outside the scope of
this specification".
* Section 5.5: "In parallel with the registration procedure [...], the
Unicode string is checked...". What does "in parallel" mean here?
Certainly not time synchronicity. Wouldn't "Similar to the
registration procedure" be clearer?
* Section 5.5: The six bullets are very similar in content (but not in
wording) to those under section 4.3. That makes it difficult for
implementors to follow ("why is the text different? is there some subtle
meaning I am missing?") and adds unnecessary verbosity to the specification.
I suggest collecting the steps which are identical in registration and in
lookup, putting them in just one separate section called "Basic
Registration And Lookup Checks" and referring to that section from 4.3 and
5.5.
* Section 5.5, regarding anchor 20: if a label not satisfying the
idna2008-bidi requirements is not IDNA-valid, there is no point in letting
a resolver query that U-label; it can deliver a failure straight away. So
IMHO the "SHOULD" should be a "MUST".
* Section 5.5: "the resolver MUST rely on the presence or absence of
labels in the DNS to determine the validity of those labels". Actually, it
can only be "to determine the existence of those labels", nothing further.
* Section 5.6: "[...] is converted to an A-label using the punycode
algorithm." Add: "and prepending the ACE prefix". Btw, Punycode is
sometimes capitalized, sometimes not.
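To make the two-part nature of the conversion concrete, here is a small Python sketch using the standard `punycode` codec. It assumes the label has already passed all validity checks and contains at least one non-ASCII character; the function names are hypothetical:

```python
ACE_PREFIX = "xn--"  # the ACE prefix


def to_a_label(u_label: str) -> str:
    """Punycode-encode the label, then prepend the ACE prefix."""
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Strip the ACE prefix, then Punycode-decode the remainder."""
    if not a_label.lower().startswith(ACE_PREFIX):
        raise ValueError("not an A-label: missing ACE prefix")
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


print(to_a_label("bücher"))         # xn--bcher-kva
print(to_u_label("xn--bcher-kva"))  # bücher
```

Without the prefix, the Punycode output alone ("bcher-kva") is not an A-label.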
* Section 7: s/compatable/compatible/
* Section 7: The "E" in "ACE encoding" stands for "Encoding". I'd rather
write "ASCII compatible encoding".
* Section 7, 3rd paragraph: "privileged or anti privileged domains". I
haven't the slightest idea what that is supposed to mean.
* Appendix A: Neither here nor in rationale-01, section 13.2, can I find a
requirement for the IANA Contextual Rules Registry to be versioned. It
might be obvious, but it should be made explicit. This versioning need not
necessarily follow Unicode versioning (one could imagine changes in
it that are not directly bound to Unicode progress). The same goes, btw,
for the derived property registry.
* Appendix A, about U+002D: Typo in the regexp, it should be \u002D
instead of \u00SD
* Appendix A, about U+00B7: Typo in the regexp, it should be \u006C
instead of \u006c
* Appendix A, about U+0375: Typo in the regexp, missing \u for the char
and \p for the script
* Appendix A, about U+02B9: Typo in the regexp, missing \u for the char
and \p for the script
* Appendix A, about U+05F4: Copy&paste typo in the regexp, it should be
\u05F4 instead of \u05F3
* Appendix A, about U+3005: Copy&paste typo in the regexp, it should be
\u3005 instead of \u30FB
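Typos of the first kind are easy to catch mechanically. As an illustration only, here is a Python sketch that tries to compile an escape with Python's `re` module, which is not necessarily the regexp dialect the draft intends; the helper name is hypothetical:

```python
import re


def compiles(pattern: str) -> bool:
    """True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print(compiles(r"\u002D"))  # well-formed escape for HYPHEN-MINUS
print(compiles(r"\u00SD"))  # "S" is not a hex digit
```

A check like this does not catch the copy&paste errors, where the escape is well-formed but names the wrong character.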
* Appendix B: I am not sure of the usefulness of this whole Appendix;
major programming languages directly support Unicode regexps, and if one
doesn't, the programmer can check widely available documentation.
Regarding anchor41: What part of a construction like "\p(Script:XXX)" is
fairly exotic? And how exotic is it in comparison with the bidi rules or
with the elaboration of the derived property? Keeping Appendix B will lead
to duplication of effort and chances for inconsistency (for instance,
right at the beginning on the character hyphen-minus: "Must appear [sic]
at the beginning or end of a label"...). Though well-intended, I suggest
dropping the effort of Appendix B.
Best regards,
Marcos Sanz
DENIC eG