my comments on draft-ietf-idnabis-protocol-14 (second part)
"Martin J. Dürst"
duerst at it.aoyama.ac.jp
Tue Sep 1 08:30:38 CEST 2009
(second part of my comments)
Section 5:
para 2: "The two steps described in Section 5.2 are required.":
Superfluous. Make sure there's a MUST at the right place in that
section. (Looking at 5.2, I have no clue what the two steps should be.
This shows that indirect requirements like the above are rather unhelpful.)
5.1, first paragraph: Although IDNs will often get extracted from IRIs
or URIs, there are many cases where these constructs are not involved.
Examples would be telnet or ping commands, and so on. So IRIs and URIs
should be deemphasized more.
5.1: "Processing in this step and the next two are local matters, to be
accomplished prior to actual invocation of IDNA.": Again, which steps?
Before, we supposedly had two steps in 5.2, now it looks as if we are
talking about 5.2 and 5.3 as two steps. -> Create a subsection such as
"Input preparation" into which all the preliminary material goes.
Alternatively, talk about subsections, with subsection numbers for clear
identification.
5.2: "is not already Unicode" -> "is not already in Unicode" (in
parallel to 'into' in the line before)
5.2 "A Unicode string may require normalization as discussed in Section
4.1.": There is no "discussion" in 4.1 (and no need for discussion).
Express the requirements here independently of Section 4.
5.3: (just checking) "See the Name Server Considerations section of
[IDNA2008-Rationale] for additional discussion on this topic.": From the
context, Name Server doesn't look related (we are client-side here).
5.3: "That conversion and testing SHOULD": Replace 'That' with something
clearer and more precise.
5.3, para 2: List the alternatives that are possible. Avoid mishmash
textual paragraphs.
5.4, para 1: Mishmash again. Most of this para is best removed.
5.4, para 1: "Putative labels": Both in Section 4 and 5, labels are for
the most part putative, because they don't conform to the definitions
unless checked. Either before section 4, or once at the start (Input
subsection) of both section 4 and section 5, say that for the most part,
we are dealing with putative labels, but that 'putative' isn't repeated
every time, to make the text easier to read.
5.4, page 12: Finally a bullet list. I almost thought that the author
didn't know how to create bullet lists, or was of the opinion that
bullet lists don't have a place in a spec. Quite to the contrary, please
make sure there are many more bullet lists. It will make everything much
easier to read and clearer.
5.4: "Labels that are not in NFC form as defined in [Unicode-UAX15].":
There is only one definition of NFC, but the sentence suggests there are
several. Please change to "Labels that are not in NFC [Unicode-UAX15]."
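To make the condition in this bullet concrete, here is a minimal sketch in Python using the standard unicodedata module. The helper name is_nfc is hypothetical and is not taken from the draft.

```python
import unicodedata


def is_nfc(label: str) -> bool:
    """Return True if the label is already in Normalization Form C."""
    # A string is in NFC exactly when normalizing it to NFC changes nothing.
    return unicodedata.normalize("NFC", label) == label


# Precomposed u-umlaut (U+00FC) versus "u" plus combining diaeresis (U+0308).
print(is_nfc("b\u00fccher"))
print(is_nfc("bu\u0308cher"))
```

The second string is canonically equivalent to the first but is not in NFC, so a lookup application applying this bullet would reject it.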
5.4: Please move bullet 1 (UNASSIGNED) and bullet 4 (DISALLOWED) and all
the other table-related bullets together. I think it's best to put
UNASSIGNED last (and mention that this is the category most subject to
change).
5.4: Streamline the wording used to refer to Tables and a category.
Currently, we have:
    in the UNASSIGNED category of [IDNA2008-Tables]
    in the "DISALLOWED" category in the permitted character table
        [IDNA2008-Tables]
    that are identified in [IDNA2008-Tables] as "CONTEXTJ"
5.4: "Labels whose first character is a combining mark (see Section
4.2.3.2).": Refer directly to the relevant Unicode definition, rather
than to section 4.2.3.2 (which contains a MUST, which is already
implicit here).
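As an illustration of what a direct reference to the Unicode definition could amount to, here is a minimal Python sketch. It assumes that "combining mark" means a character whose General_Category is Mn, Mc or Me; that reading is an assumption for this example, and the helper name starts_with_combining_mark is hypothetical.

```python
import unicodedata


def starts_with_combining_mark(label: str) -> bool:
    """Return True if the first character of the label is a combining mark."""
    # Assumption: a combining mark is a character with General_Category
    # Mn, Mc or Me, all of which start with "M".
    return bool(label) and unicodedata.category(label[0]).startswith("M")


print(starts_with_combining_mark("\u0308abc"))  # leading combining diaeresis
print(starts_with_combining_mark("abc"))
```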
5.4: "In any event, lookup applications should avoid attempting to
resolve labels that are invalid under that test.": Remove. We already
have a SHOULD, no need for a should on top of that.
5.4, last para: I assume this is e.g. about labels with mixed
scripts, etc. What it essentially seems to say is that a browser may warn
users if it detects mixed scripts, but if the user still wants to see
the page, s/he is entitled to it. In such a context, the word 'validity'
seems quite a bit out of place; it would be better to speak about 'other
tests' or some such in a more general way.
5.5, para 1: "using the Punycode algorithm (with the ACE prefix added)":
The parenthetical seems to suggest that adding or not adding the ACE
prefix is an (optional) part of the Punycode algorithm, but RFC 3492
does not define the prefix, nor is the addition of the prefix part of the
Punycode algorithm. -> Convert parenthetical to a clause or sentence
("... and then adding the ACE prefix." or so).
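The separation of the two steps can be sketched in Python, whose built-in "punycode" codec implements only the RFC 3492 algorithm and adds no prefix. The function name to_a_label and the constant ACE_PREFIX are hypothetical names for this example, and the sketch performs none of the validity tests from Section 5.4.

```python
ACE_PREFIX = "xn--"  # specified by IDNA, not by RFC 3492


def to_a_label(u_label: str) -> str:
    """Convert a (putative) U-label to A-label form, without any checking."""
    # Step 1: the Punycode algorithm proper (RFC 3492); no prefix is involved.
    encoded = u_label.encode("punycode").decode("ascii")
    # Step 2: a separate operation, adding the ACE prefix.
    return ACE_PREFIX + encoded


print(to_a_label("b\u00fccher"))
```

Keeping the prefix as a separate step in the code mirrors the suggested wording: the codec output on its own is not yet an A-label.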
5.5, rest from second sentence in para 1: As said in my comments on
Section 4, a summary is unnecessary. Also, it has nothing to do with
Punycode conversion. In addition, the second bullet point is confusing,
because an A-label (checked or not) cannot be Punycode-converted again.
-> remove
5.6: "That ... string" -> "The string resulting from the conversion in
Section 5.5"
5.6: "That lookup" -> "The lookup"
5.7: What about (streamlined):
    Security Considerations for this version of IDNA are described in
    [IDNA2008-Defs], except for the special issues associated with right to
    left scripts and characters, which are discussed in [IDNA2008-BIDI].
8./9.: These should be merged. The text explains it all.
8.: "Hoffman and Costello ... should not be held responsible for any
errors or omissions.": Remove; this is implicitly clear, since in the end it
is the WG and the IETF that are responsible. Similarly for "As is usual with
IETF specifications, while the document represents rough consensus, it
should not be assumed that all participants and contributors agree with
all provisions."
References [Unicode-RegEx], [Unicode-Scripts], [Unicode-UAX15] (and
maybe others): Unicode data files don't have explicit authors, but
Unicode TRs (and similar documents) have authors/editors, same as RFCs.
Please don't drop this information.
Regards, Martin.
--
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp mailto:duerst at it.aoyama.ac.jp