Comments on idnabis-rationale-01
Marcos Sanz/Denic
sanz at denic.de
Thu Jul 17 09:50:33 CEST 2008
All,
here follows a healthy mixture of much nitpicking and some more important
comments on rationale-01. All in all, the document has improved a lot
since the last time I read it (pre-WG).
* Section 1.4, about ACEs: "they allow [...] clicking on URLs even though
the domain name displayed is incomprehensible to the user". Sorry, but I
couldn't help laughing at the thought of the draft picturing as a positive
trait the fact of users clicking around on URLs that are incomprehensible
to them (and even more so in the context of IDNs and security, and even
more so right away in the introductory text). So I would suggest a
different, somewhat more generic formulation (with the same rationale) like
"they are a last resort that would allow rudimentary IDN usage, for
instance, in case of the necessary fonts not being installed in the
computer of the user".
* Section 1.5.3: s/regardless of that actual administrative arrangements
or level in the tree/regardless of actual administrative arrangements or
level in the DNS tree/
* Section 1.5.3: The sentence starting with "Further, because those
documents were not terribly clear" seems to be a jab at something
(somebody?), but the meaning gets lost without the context. Further, I
don't think the wording is formal enough for a standards document. I
suggest changing it into a simple "Lack of clarity in those documents has
contributed to confusion with these terms".
* Section 1.5.4.1.1, 2nd bullet: "described in RFC 1034, RFC 1123 and
elsewhere". That formulation is imprecise, and certainly not helpful. I'd
go for "described in RFC 952 and RFC 1123.", or maybe expand that list
with RFC 1034/1035, but in any case drop the "elsewhere".
* Section 1.5.4.1.1, 3rd bullet: For the first time appears the concept of
"valid U-labels" and "valid A-labels", but... isn't that a pleonasm? The
current definitions (in the very same section) of A-label and U-label
already require *validity*. Next paragraph tries to clarify "To be valid,
U-labels and A-labels must obey...", but again, that's a constraint that
is implicit in the current definition. So either we have a pleonasm, and
should thus s/valid U-label/U-label/g and s/valid A-label/A-label/g, or
the concept of "invalid [A/U]-label" really exists, in which case it
should be defined (and the definition of [A/U]-label accordingly
modified).
* Section 1.5.4.1.1: "[...] both U-labels and A-labels must represent
strings in normalized form". I think s/represent/be/ would be technically
more correct. Besides that: What does "normalized" mean here? It should be
specified (NFC?).
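For illustration only, and assuming NFC is the form meant (that is my
guess, not something the draft states), the difference between a
normalized and a non-normalized string can be sketched in Python. The
helper name is_nfc and the sample strings are made up for this example:

```python
import unicodedata

def is_nfc(label: str) -> bool:
    """Return True if the label is already in Normalization Form C."""
    return unicodedata.normalize("NFC", label) == label

# Same visible text, two different code point sequences.
composed = "b\u00fccher"     # 'u with diaeresis' as one precomposed code point
decomposed = "bu\u0308cher"  # 'u' followed by COMBINING DIAERESIS (U+0308)

print(is_nfc(composed))
print(is_nfc(decomposed))
# Normalizing the decomposed form yields the composed one.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

Without naming the form, a reader cannot tell which of the two sequences
the text expects a label to "be".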
* Section 1.5.4.1.1: I cannot parse the sentence starting with "Strings
that do not conform [...]" and ending with "[...] similar resources".
Additionally, I cannot tell which category all existing domain names with
hyphens in the third and fourth positions are meant to fall into. We
have thousands of those in our zone (many starting with "bq--", you
probably know why). Can you clarify whether those domain names, according
to IDNA2008, "can actually appear in DNS zone files or queries" or not?
Are they (valid) LDH-labels?
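To make the question concrete, here is a minimal Python sketch of the
three cases I have in mind. The bucket names ("ordinary", "ace-prefixed",
"other-prefixed") are my own and not terms from the draft, and the sample
labels are invented. The only thing assumed beyond the text above is that
"xn--" is the ACE prefix:

```python
ACE_PREFIX = "xn--"  # assumption: the prefix that marks A-labels

def has_hyphens_in_positions_3_and_4(label: str) -> bool:
    """True if the third and fourth characters of the label are both '-'."""
    return len(label) >= 4 and label[2:4] == "--"

def classify(label: str) -> str:
    """Sort a label into one of three illustrative buckets."""
    if not has_hyphens_in_positions_3_and_4(label):
        return "ordinary"
    if label.lower().startswith(ACE_PREFIX):
        return "ace-prefixed"
    # Hyphens in positions 3 and 4, but some prefix other than the ACE one.
    return "other-prefixed"

for label in ["example", "xn--bcher-kva", "bq--abcdef", "a--b"]:
    print(label, "->", classify(label))
```

It is the status of the third bucket, "other-prefixed", that the draft
leaves unclear to me.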
* Section 1.5.4.2: "LDH-labels are not IDNs". This sentence contradicts
section 1.5.6: "An [...] IDN is a domain name that may
contain any mixture of LDH-labels, A-labels or U-labels. This implies that
every conventional domain is an IDN (which implies that it is possible for
a domain name to be an IDN without it containing any non-ASCII
characters)." It is very important to settle that question, so that the
applicability of IDNA2008 can be precisely defined (this is related to
the previous issue).
* Section 1.5.5: s/include the prefix/include the ACE prefix/
* Section 1.5.5: s/output of ToASCII/output of the ToASCII operation/
* Section 1.5.6: I suggest moving the whole paragraph starting with 'An
"internationalized domain name" [...]' under section 1.5.4 (Terminology
Specific to IDNA), maybe between the current sections 1.5.4.2 and 1.5.4.3.
It fits best there.
* Section 1.5.6, still the same paragraph: "such restrictions are
mandatory for IDN registries". I still don't see the point since, like
somebody said on the list, one possible policy could be "we allow the full
variety and complexity of the IDNA2008 protocol including all code points
in all possible valid combinations". In any case, making policy
restrictions mandatory contradicts idnabis-protocol-02, section 4.4, that
states "there SHOULD be policies" (more on that normative language later).
Thus, if anything, I would go for a plain "we recommend such restrictions
for IDN registries" here in section 1.5.6.
* Section 1.5.6: s/"The key words/The key words/
* Section 2.9: Duplicated with 2.8. Suggestion: Drop it.
* Section 3: All the information in that paragraph is already in section
1.4 and, of itself, this section 3 doesn't make any structural point.
Suggestion: Drop it.
* Section 6.1.1, 1st paragraph: s/character in this group/character in
this category/ for nomenclature reasons
* Section 6.1.1: s/right to left/right-to-left/g for consistency with
idnabis-bidi-01
* Section 6.1.1, 2nd paragraph: s/VALID",/VALID"/
* Section 6.1.1, 3rd paragraph: The subordinate sentence starting with
"[...] unless the code points themselves are removed from Unicode [...]"
is irrelevant to practice (deprecated Unicode characters are retained in
the standard) and creates nothing but confusion. Is this again a pun I am
missing? To my eyes, this sentence is a rationale for nothing and is
better left out.
* Section 6.1.1.1, 1st paragraph: invert the order of ZWNJ and ZWJ code
points within the brackets to match the order in the explanatory text.
* Section 6.1.1.1, 2nd paragraph: "Only the former are fully tested at
lookup time": the verb tense doesn't match the context; it would be better
as "should be fully tested".
* Section 6.1.1.2: For the first time 2119-language appears ("MUST NOT
appear in putative labels"), but the prologue of section 6 states "[The
information given in this section] is not normative". I actually recommend
sticking to non-normative language (since this is only a rationale
document) and dropping the capitalization. Btw, what is a "putative
label"? Not defined before and not clear to me...
* Section 6.1.1.2: s/more more/more/
* Section 6.1.2, 2nd paragraph: In line with my previous comment on
hypothetical Unicode character removal, I'd just drop this sentence.
* Section 6.1.2, last bullet: s/used to form a letter/used to combine with
a letter/
* Section 6.1.3: s/MUST NOT/must not/ for reasons explained above
* Section 6.2: Again and FWIW, "no restrictions" is also a possible
policy, so I don't see the point. In any case: s/SHOULD/should/ for the
reasons explained above.
* Section 7.2: s/only be exposed to users and in contexts/only be exposed
to users in contexts/
* Section 7.3: Just a naive question, with no hidden agenda: what are the
current arguments *against* including the Eszett in the PVALID list
under the category of exceptions? Thanks.
* Section 7.3, last paragraph: "[...] a registry [...] should give serious
consideration to applying a 'variant' model". Well, the adoption of
"variants" is just a possibility among many others to deal with these
issues (btw, wasn't the "fraud" mentioned later in the sentence explicitly
out of scope for the wg?), so I don't understand why this document should
declare a preference for the variant model now in this context. So please,
change "should give serious consideration" to "could give consideration",
if anything.
* Section 8: "no one has ever seriously claimed that being liberal in what
is accepted requires being stupid". Sorry, but I don't find this
appropriate for a standards text.
* Section 8: "Conversely, resolvers can (and SHOULD or maybe MUST) reject
labels that clearly violate global (protocol) rules". idnabis-protocol-02
section 5.5 however says that they MUST, so we can accordingly update
here. Further, resolvers must reject labels that violate rules, no matter
whether they do so *clearly* or in very subtle ways. Maybe the adverb in the
sentence was in the wrong position. Text suggestion: "Conversely,
resolvers clearly must reject labels that violate global (protocol)
rules".
* Section 8: "If a string doesn't resolve, it makes no difference whether
it simply wasn't registered or was prohibited by some rule". Maybe I am
misreading here, but I certainly think that there's a difference (at least
one round trip). I don't get the point of the statement in this context.
* Section 9, last paragraph: I concur on removing that paragraph, like the
anchor suggests. It's off-topic and no rationale for anything obvious.
* Section 10.1.2, last sentence: "Systems looking up or resolving DNS
labels, especially IDN DNS labels, MUST be able to assume that applicable
registration rules were followed for names entered into the DNS". I can't
figure out why this must be so and why it could be relevant to the
standard in any way, and I even think that this is very brittle from a
security point of view. No, not only brittle: it's actually dangerous. The
right phrasing should be: "Systems looking up or resolving DNS labels MUST
make no assumptions about the data they are going to receive".
* Section 10.1.3, last bullet: "MUST NOT validate other contextual rules
about characters, including mixed-script label prohibitions". There is no
such thing as a general prohibition of mixed-script labels, like the text
might suggest, so I'd just drop the "including [...]" part of the
sentence. The text is not normative anyway.
* Section 10.2: s/they prohibits/they prohibit/
* Section 10.5: This section duplicates bullets 1 and 2 of
section 10.1.1, so I concur with the suggestion in anchor36 to
eliminate the whole section. In case that does not happen, further
comments follow.
* Section 10.5: s/ICANN guidelines/ICANN Guidelines for the Implementation
of Internationalized Domain Names/
* Section 10.5, 3rd bullet: s/the those/those/
* Section 10.5, 4th bullet: s/The actual situation is even worse than
this. //
* Section 10.6: This section partially duplicates bullet 3 of
section 10.1.1.
* Section 14: For the first time the term "Stringprep2003" appears. Change
it to plain "Stringprep", to be consistent with the rest of the document.
Good work. Thanks for your time, John.
Best regards,
Marcos Sanz
DENIC eG