SASLprep200x

Fri Jan 5 16:06:39 CET 2007

--On Thursday, 04 January, 2007 17:49 -0800 Paul Hoffman
<phoffman at imc.org> wrote:

> At 7:05 PM -0500 1/4/07, John C Klensin wrote:
>> While I agree, I really don't see a problem here and, to the
>> extent to which there is an issue, I think Simon has
>> identified the way out.  At the moment, we have a base list
>> of rules from which to select in Stringprep.  One of our
>> goals, IMO a necessary one, is to make Stringprep as nearly
>> as possible Unicode-version-agnostic.  But doing that doesn't
>> have any inherent impact on either IDNA or SASL or even on
>> what Stringprep does and does not permit (at least within the
>> Unicode 3.2 context and conceptually more generally).
> 
> Strongly disagree. What Simon is asking for (as few character
> prohibitions as possible to aid SASLprep) is inherently
> against the basis of the -bis effort, which is to start with a
> more limited defensible set. If StringprepBis has a
> significantly larger set of characters than NamePrepBis,
> NamePrepBis will become a convoluted set of subsetting rules.
> That does not serve the DNS community in the least.
>...

I think we are not so much disagreeing as miscommunicating.  In our
end result, we need to, somehow, accommodate at least three
different applications and whatever requirements they produce:

	(i) IDNs
	(ii) Identifiers to be used to name certificates and
	other security credentials
	(iii) Passwords and other strings that benefit from high
	entropy.

I suspect there are, or are going to be, others, including
generalized Unicode text strings (the Net-Unicode work) and
probably some best practices guidelines for IRI tails.   Each of
these is going to have different requirements, or at least
different optimality points.  Of the three above, I suspect the
first two are much more closely related to each other than
either is to the third, but the first two are almost certainly
not the same.   If one took Simon's argument to the extreme (I
think Martin's comments explain why one should probably not do
so), then (iii) would come out fairly close to "generalized
Unicode text"... and my expectation of that is that it will end
up very close to "use NFC", not anything complicated in
stringprep (and perhaps not touching stringprep at all).
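To make the "use NFC" end point concrete, here is a minimal sketch in Python, assuming the whole preparation step for such strings really were NFC normalization and nothing more. The function name prepare_text is hypothetical; unicodedata.normalize is the standard-library call.

```python
import unicodedata


def prepare_text(s: str) -> str:
    """Hypothetical 'generalized Unicode text' preparation: NFC only.

    No character prohibitions and no mappings are applied here; that is
    the assumption being illustrated, not a rule from any specification.
    """
    return unicodedata.normalize("NFC", s)


# Precomposed and decomposed spellings of the same character compare
# equal once both have been normalized.
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT
assert prepare_text(composed) == prepare_text(decomposed)
```

Nothing in this sketch consults a Stringprep table, which is the sense in which such a profile might not touch stringprep at all.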

It is possible that Simon's real concern lies in conflation of
(ii) and (iii) either historically or going forward but, if it
is, that is strictly a conversation to be had in the SASL
community, and the security community more generally: it is Not
Our Problem and introducing it here can only delay useful work
in both areas.

It would be stupid to distort any of these target applications
to meet the perceived needs of any of the others.  We would
never get this effort finished if we tried to guess at, and
impose, requirements on any of the others as a consequence of
what we think IDNA needs.  But, as long as several of them
depend on the same underlying Stringprep model, it would be
silly to pretend that none of the other applications exist.

For the present, I think we need to concentrate on what results
we need and then figure out how to implement them with maximum
explanatory power. "Maximum explanatory power" is almost
certainly the same thing as a minimum of convoluted rules (for
either subsetting or matching) and that, in turn, is a predictor
of good interoperability.

I am, for example, resisting alternate Hangul forms because I
fear that introducing special character mappings may complicate
the model and create convoluted rules that implementers who
don't understand the subtleties of Korean won't get right (or,
worse, won't care whether they get right or not).   If it is
necessary, it is necessary, but I hope it isn't.

Suppose a key result of our work is a Unicode property for
"IDN-suitable" and there are almost no mappings except for case
mappings in the scripts that _clearly_ have case distinctions
(not more subtle presentation distinctions).  I think several of
us are expecting just that at this point, although we could be
wrong. Then, if we preserve the current IDNA -> Nameprep ->
Stringprep model, Stringprep would need to be modified to
contain an additional section that reflects that property and
model only.  I wouldn't expect that property (or that Stringprep
section) to be used by anything else (although, if it proved
useful to others, so be it).  That would leave the rest of
Stringprep essentially unchanged and useable in whatever way is
now, or becomes, suitable for other applications.  Since we will
not have restricted them in any way, or changed their rules, Not
Our Problem.  And, in this specific case, if Simon wants or
needs different rules for passwords, that is a discussion he
needs to have with the SASL folks, not with us (which I believe
is exactly what you are arguing).
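The model supposed above (one "IDN-suitable" property, plus case mapping and almost nothing else) might look roughly like the following Python sketch. Everything in it is an assumption made for illustration: no such Unicode property exists here, so is_idn_suitable fakes one from general categories, and prepare_label is a hypothetical name, not part of Nameprep or Stringprep.

```python
import unicodedata

# ASSUMPTION: a crude stand-in for an "IDN-suitable" property, built from
# Unicode general categories (lowercase/other/modifier letters, combining
# marks, decimal digits). The real property, if defined, would be a
# published table, not this guess.
_SUITABLE_CATEGORIES = {"Ll", "Lo", "Lm", "Mn", "Mc", "Nd"}


def is_idn_suitable(ch: str) -> bool:
    """Return True if the hypothetical property holds for one character."""
    return ch == "-" or unicodedata.category(ch) in _SUITABLE_CATEGORIES


def prepare_label(label: str) -> str:
    """Apply the only mapping (lowercasing), then check the property."""
    mapped = label.lower()
    for ch in mapped:
        if not is_idn_suitable(ch):
            raise ValueError(f"U+{ord(ch):04X} is not IDN-suitable")
    return mapped
```

The point of the sketch is its shape: one mapping step and one membership test, with no subsetting rules layered on top. A password profile would simply not call it.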

> So far, no one has shown that the greater character
> restrictions we want to bring to IDNs would have any
> appreciable effect on passwords under SASLprep. Instead of us
> breaking our backs for them, I propose that we move forward
> with our current thinking and, if the SASL community feels too
> restricted by our output, they can make a very simple fork of
> SASLprep that says "for passwords, what they said, plus the
> following characters".

I hope that statement would not turn into a "convoluted set of
subsetting rules" for them, but yes, that seems to me to be
exactly the right strategy.

      john