my comments on draft-ietf-idnabis-protocol-14 (first part)

"Martin J. Dürst" duerst at it.aoyama.ac.jp
Thu Aug 27 14:01:56 CEST 2009


I'm not as positive on this as Andrew. While I think the general ideas 
are ready, the writing is way less clear than IDNA2003, and because this 
is the most important document, we have to make sure this is *way* clearer.


Intro, para 1: I missed the term IDNA2003 in Defs; it would have been
useful. I didn't complain because I thought it had been 'deprecated'.
Seeing it here, I think it should go back to Defs, and be actively used
in Defs and Protocol at least, to simplify and clarify prose.

para 2: "does not changes" -> "does not change"

para 2: "IDNA does not depend on any changes to DNS servers, resolvers, 
or protocol elements" -> "IDNA does not depend on any changes to DNS 
servers, resolvers, or DNS protocol elements" or "IDNA does not depend 
on any changes to DNS (servers, resolvers, or protocol elements)" 
(Otherwise, it's possible to understand 'protocol elements' as being not 
limited to DNS.)

para 4: ", that share some terminology, reference data and operations." 
-> "These two protocols share terminology, reference data and operations."

2. Terminology

para 1: "Terminology used in IDNA, but also in Unicode or other 
character set standards and the DNS, appears in [IDNA2008-Defs]." -> 
"Terminology used in IDNA appears in [IDNA2008-Defs]." (where else these 
terms are used, or where they are from, can be explained in Defs where 
necessary, but is absolutely irrelevant here)

para 1: "Terminology that is required as part of the IDNA definition, 
including the definitions of "ACE", appears in that document as well." 
-> remove (first, the word 'definition' is used with two slightly 
different meanings, and second, I don't see the point of singling out 
"ACE".)

3.1, Requirements
Requirement 2: Equivalence is already defined in Defs. Please make sure 
there is only a single definition.

Requirement 2: Why is it a MUST that U-Labels are compared without 
case-folding (even for ASCII chars?) or other steps?

Requirement 2: "In many cases, validation may be important for other 
reasons and SHOULD be performed.": Is this restricted to when trying to 
compare? Or in general?

Requirement 3: This does double duty, and should be removed. The
alternative is to convert 3.1 into a conformance section, as is usual
e.g. for ISO standards, but then a lot more rewriting will be necessary
in all of 3.1.

3.2 Applicability

para 1: "IDNA applies to": What does "applies to" mean?

para 2: "Because it uses the DNS, IDNA applies" -> "Because IDNA uses 
the DNS, it applies", or even better "Because IDNA uses the DNS, IDNA 
applies" (repetitions don't hurt in standards, reference before referent 
does)

para 2: "unless those protocols and implementations of them" -> "unless 
those protocols and their implementations"

para 2: "be aware of IDNs in Unicode" -> "be aware of IDNs" (whether 
they are in Unicode or not is irrelevant here)

3.2.1, para 1: The first paragraph reads as if IDNA applied to domain
names in e.g. TXT records in CLASS IN. I think it would help here to say
exactly what is meant by "IDNA applies". In one sense, IDNA applies
nowhere in DNS records; they are all just ASCII. In another sense, IDNA
does apply: labels starting with xn-- are presumed to be IDNA labels, and
you can add an IDN (or a label thereof) to a DNS record by using A-labels.

3.2.1, para 2: The SRV discussion has significant overlap with Defs,
please reduce.

4. Registration Protocol
para 1: "defines *the* procedure" ... : This would work better if there 
were really only one procedure, and it were written as a procedure. 
However, there are often variations, and different, often non-procedural 
ways in which things are expressed (e.g. 'labels must ...' instead of 
'if a label doesn't satisfy x, abort')

para 2: "the registration and lookup protocols (Section 5)" -> "the 
registration protocol (this section) and the lookup protocol (Section 
5)" (shortcuts are the enemies of specifications)

para 2: "while ... are very similar in most respects, they are 
different" -> "while ... are very similar in most respects, they are not 
identical"

para 2: "follow the appropriate steps": appropriate appeals to value 
judgement, which isn't adequate here.

4.1, title: Why suddenly "Process" instead of "Procedure"? Why not just 
"Input"? And why singular in the title, and then plural in the first 
line of the text?

4.1: "are outside the scope of these protocols": How many protocols are 
there? Only one that's relevant here.

4.1: Why is NFC a condition on the input? Please make it a validation 
step afterwards, to streamline things.
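To illustrate what "NFC as a validation step" could look like, here is a minimal Python sketch. The function names are hypothetical and not taken from the draft; the only assumption is that the check rejects a label that is not already in NFC, rather than silently normalizing it.

```python
import unicodedata


def is_nfc(label: str) -> bool:
    """Return True if the label is already in Normalization Form C."""
    return unicodedata.normalize("NFC", label) == label


def check_nfc(label: str) -> None:
    """Validation step (hypothetical name): reject, do not repair."""
    if not is_nfc(label):
        raise ValueError("label is not in NFC: %r" % label)
```

Written this way, the input step accepts any string, and the NFC condition becomes one check among the others that follow.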

4.1: "Entities responsible for zone files ("registries") are expected to 
accept only the exact string for which registration is requested, free 
of any mappings or local adjustments.": It's clear to me what we want 
here, but it's much better to write this as a condition on the later 
processing, rather than on input, something like: "Entities responsible 
for zone files ("registries") MUST NOT apply any mappings or local 
adjustments of any kind to the exact string for which registration is 
requested."

4.1: "They SHOULD avoid any possible ambiguity by accepting 
registrations only for A-labels, possibly paired with the relevant 
U-labels so that they can verify the correspondence."
This has to be improved. First, the SHOULD doesn't belong on the reason,
and the reason, if anywhere, belongs at the end. Second, there are three
possible ways input can come in, so let's list them:
"Entities responsible for zone files ("registries") MAY accept input in
any of three forms:
1) As a pair of A-label and U-label
2) As an A-label only
3) As a U-label only
1) and 2) are RECOMMENDED because the use of A-labels avoids any
possibility for ambiguity."
(the first sentence in 4.2.1 can then be removed)

4.2.1: This is a complex jungle of conditions on input, conversions, and
so on. What should be done is:
a) extract the 'raw' (without any preconditions) U-label->A-label and 
A-label->U-label 'functions' into subsections e.g. in Section 3; these 
will serve as building blocks both in Section 4 and Section 5.
b) As the first step of the registration procedure, make sure we have
both an A-label and a U-label. One way to write this is:
"4.2.1: Preprocessing
1) If the input contained an A-label and a U-label, check that they are 
equivalent (or whatever that was called; the conditions are somewhere in 
Defs). If the check fails, abort registration.
2) If the input contained an A-label, but no U-label, calculate the 
U-label according to @@@.
3) If the input contained a U-label, but no A-label, calculate the
A-label according to @@@."
The above makes sure we have both an A-label and a U-label from here
on. Checking on these can be performed independently (e.g. length check
on A-label, NFC check on U-label). Conversion to punycode is no longer
needed in 4.4, because we simply put the A-label we have now into the
zone (assuming we have passed all the checks up to here, of course).
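The preprocessing steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `to_a_label`, `to_u_label` and `preprocess` are hypothetical, the 'raw' conversions are approximated with Python's built-in `punycode` codec plus the `xn--` prefix, and the equivalence check is reduced to "the U-label converts to the given A-label". None of the later validation (length, NFC, code point rules) is included.

```python
ACE_PREFIX = "xn--"


def to_a_label(u_label: str) -> str:
    """Raw U-label -> A-label conversion, without any preconditions."""
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")


def to_u_label(a_label: str) -> str:
    """Raw A-label -> U-label conversion, without any preconditions."""
    lowered = a_label.lower()
    if not lowered.startswith(ACE_PREFIX):
        raise ValueError("missing ACE prefix: %r" % a_label)
    return lowered[len(ACE_PREFIX):].encode("ascii").decode("punycode")


def preprocess(a_label=None, u_label=None):
    """Return an (A-label, U-label) pair, or raise to abort registration."""
    if a_label is not None and u_label is not None:
        # Step 1: both given, so check that they correspond.
        if to_a_label(u_label) != a_label.lower():
            raise ValueError("A-label and U-label do not correspond")
        return a_label.lower(), u_label
    if a_label is not None:
        # Step 2: only the A-label given, so calculate the U-label.
        return a_label.lower(), to_u_label(a_label)
    if u_label is not None:
        # Step 3: only the U-label given, so calculate the A-label.
        return to_a_label(u_label), u_label
    raise ValueError("no label supplied")
```

For example, `preprocess(u_label="bücher")` and `preprocess(a_label="xn--bcher-kva")` both yield the pair `("xn--bcher-kva", "bücher")`, after which the checks on each form can run independently.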

4.2.1: (probably not needed anyway) "both the A-label form and the 
U-label one" -> "both the A-label form and the U-label form"

4.2.1: Word the A-label checks more clearly, and create section "4.2.2 
A-label Validation"

4.2.3.2: "a combining mark or combining character" -> "a combining 
character" (combining marks are a special case of combining characters, 
and as such irrelevant here)
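As a rough illustration of the check under discussion, the following Python sketch tests whether a label begins with a combining character. It assumes that "combining character" means a code point whose Unicode General_Category is M (Mn, Mc or Me); that reading, and the function name, are assumptions and not wording from the draft.

```python
import unicodedata


def starts_with_combining(label: str) -> bool:
    """True if the first code point has General_Category Mn, Mc or Me."""
    return bool(label) and unicodedata.category(label[0]).startswith("M")
```

Under this assumption a single category test covers both terms, which is why listing them separately adds nothing.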

4.2.3.3: "To check this, each code-point marked as CONTEXTJ and CONTEXTO 
in [IDNA2008-Tables] MUST have a non-null rule." Is this a requirement 
on Tables? Are there "null rules"? What purpose do they serve, what's 
the difference between them and DISALLOWED?

4.2.3.4: What are "characters written from right to left"? Either we 
define this clearly here, or we leave it (or put it) in Bidi, but then 
we have to rewrite the sentence here (just requiring conformance to the 
conditions in Bidi).

4.2.4: This is totally unnecessary, please remove. If we need a summary
for what's essentially just about a page of text, we had better give up.

4.3: "Policies are likely to be informed by the local languages" -> 
"Policies are likely to be informed by the local scripts and languages" 
(IDNs are mostly a script issue, much less a language issue. ICANN has 
fixed their documents to avoid only talking about languages (they still 
could move a bit further to scripts), so let's not commit the same 
mistake here again.)

4.3: "or the application of special restrictions to others": like what? 
Like that such a label can only be resolved on Tuesdays?

4.4: The generic parts of the conversion need to go somewhere else 
(Section 3?). The actual conversion (or checking) needs to go at the 
start of 4.2. Then this section is empty and can be removed.

4.5: "The A-label is registered in the DNS by insertion into a zone." -> 
"The label is registered in the DNS by inserting the A-label into a 
zone." (distinguish registration of the abstract thing from insertion of 
the concrete thing)

More hopefully tomorrow.   Regards,    Martin.

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst at it.aoyama.ac.jp

