Draft: draft-ietf-ldapbis-strprep-06
Reviewer: Harald Tveit Alvestrand [harald@alvestrand.no]
Review Date: January 6, 2006
LC Date: 12-21-2005

Summary: Almost ready; at least one nit needs fixing

Review:
-------
This review was done during Last Call.

This document seems competently written and clear enough to permit 
implementation. Good work!

The behaviour of normalization for substring matching in 2.6.1 leaves me 
sad that the world is this baroque, but Appendix B gives a fairly good 
explanation.

There is one technical issue and one formal issue that need addressing; the 
rest of the comments here are nits.

Technical issue:

Section 2.1 "Transcode" says:

  TeletexString [X.680] values are transcoded to Unicode.  As there is
  no standard for mapping TelexString values to Unicode, the mapping is
  left a local matter.

This is confusing: the quoted text uses both "TeletexString" and 
"TelexString", and there is actually an X.409(1984) construct named 
"TelexString". None have been seen for many a year, and never (AFAIK) in 
conjunction with X.500, so it's likely that this is supposed to be 
"TeletexString". Still, the fact that it's a valid ASN.1 construct means 
that it's a technical issue, not just a spelling error.

Format/formal issues:
---------------------
The references section says:

6.1. Normative References

....
  [StringPrep]  Hoffman P. and M. Blanchet, "Preparation of
                Internationalized Strings ('stringprep')",
                draft-hoffman-rfc3454bis-xx.txt, a work in progress.

While there are other normative references to I-Ds, these are to -zeilenga- 
and -ldapbis- documents, which presumably the author has some control over.
This one seems to be a normative reference to an *expired* I-D, which is a 
Bad Thing. (-02 was published in April 2004, and has been expired for 1.5 
years). Please consider whether it's possible to refer to RFC 3454.

Of lesser importance:

[RFC1345] is listed under informative references, but is never cited in 
the text. Given my opinion on RFC 1345, that's a Good Thing. Please remove 
the reference.

Appendix A claims to be "normative", but the reference to it in section 
"Conventions and Terms" says that it's derived from Unicode data. I think 
it would be better if Appendix A said "This data is derived from the 
Unicode 3.2 data files by listing all characters with the Mn, Mc or Me 
properties. It is reproduced here for convenience." In that case, it's 
clear what to do if there should ever be a conflict between the two: the 
Unicode data files take precedence.
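
To illustrate the kind of derivation I have in mind, here is a minimal 
Python sketch. It assumes that the frozen Unicode 3.2 database in Python's 
standard library (unicodedata.ucd_3_2_0) matches the data files the draft 
was built from; the function name is my own invention, and I have not 
compared the output against Appendix A.

```python
import sys
import unicodedata

# Assumption: Python's frozen Unicode 3.2.0 database stands in for the
# Unicode 3.2 data files that the draft refers to.
UCD_3_2 = unicodedata.ucd_3_2_0

# General categories for combining marks: nonspacing, spacing, enclosing.
MARK_CATEGORIES = {"Mn", "Mc", "Me"}


def combining_marks():
    """Return all code points whose Unicode 3.2 category is Mn, Mc or Me."""
    return [
        cp
        for cp in range(sys.maxunicode + 1)
        if UCD_3_2.category(chr(cp)) in MARK_CATEGORIES
    ]


if __name__ == "__main__":
    for cp in combining_marks():
        print(f"{cp:04X}")
```

If the appendix is described as derived data, a check of this kind is all 
an implementer needs to settle any disagreement.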

Nits/suggestions for clarification (should be ignored unless a revision 
needs to be done anyway):

Given the baroqueness of section 2.6.1, it may be wise to include the 
standard disclaimer somewhere in the document about implementation: "Note 
that this specification is used to describe the outcome of the matching 
rule. It is not required that an implementation follow this exact sequence 
of steps, as long as the result is identical in all cases to the result of 
following these steps."

Appendix B took me a little while to read. It's been long enough since I 
last worked with the LDAP matching syntax that a sentence saying "In the 
following, the expression (CN=A*B*C) means that A has to match the 
beginning of the string, B has to match somewhere in the middle of the 
string, and C has to match the end of the string" would have helped me.
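
For what it's worth, my reading of that expression can be sketched in 
Python as below. This is only an illustration of the initial/any/final 
idea: the function and parameter names are hypothetical, and it 
deliberately ignores the string preparation and space handling that are 
the actual subject of section 2.6.1 and Appendix B.

```python
def substring_match(value, initial=None, any_parts=(), final=None):
    """Illustrative matcher for a filter such as (CN=A*B*C).

    initial must match the beginning of value, each entry of any_parts
    must appear in order in the middle, and final must match the end.
    Matched portions may not overlap. No string preparation is applied.
    """
    pos = 0
    end = len(value)

    if initial is not None:
        if not value.startswith(initial):
            return False
        pos = len(initial)

    if final is not None:
        # The final part must fit in what is left after the initial part.
        if end - pos < len(final) or not value.endswith(final):
            return False
        end -= len(final)

    for part in any_parts:
        idx = value.find(part, pos, end)
        if idx < 0:
            return False
        pos = idx + len(part)

    return True


if __name__ == "__main__":
    print(substring_match("AxxBxxC", "A", ["B"], "C"))
    print(substring_match("AC", "A", ["B"], "C"))
```

With (CN=A*B*C), the value "AxxBxxC" matches, while "AC" does not because 
there is nothing left in the middle for B.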