interoperability testing

John C Klensin klensin at jck.com
Tue Jul 21 21:46:56 CEST 2009



--On Monday, 13 July, 2009 17:55 +0000 Shawn Steele
<Shawn.Steele at microsoft.com> wrote:

>...
> Anyway, that's more of an EAI working group discussion.  I
> wanted to address the comment "email specs specify U-Labels".
> From the IDNA perspective I don't believe that we can assume
> that other standards will specify U-Labels.  For EAI in
> particular I believe that would be the wrong thing to do.  My
> interpretation of the current EAI work is that unmapped
> Unicode is allowed, which I think is required since users
> aren't going to know how to get it off the side of the bus and
> into U-Label form.

Sorry to have taken so long to find this...

Shawn, if one reviews the EAI documents and does so in the
context of email address transport history, the general rule is
that the sending machine is not permitted to make any
assumptions about the behavior of the receiving machine, and
intermediate machines are not permitted to mess with the
addresses (substitution of an alternate address for a primary
one in downgrading is an exception).   This is something else
that needs testing in varied and hostile environments, but...

	* Yes, RFC 5336 permits unnormalized UTF-8 strings in
	both the local part and the domain part. 
	
	* In the domain part, those strings will be passed
	through IDNA2003 ToASCII(), implying that they will be
	both extensively mapped and normalized.
	
	* In the local part, there is no guarantee that the
	receiving server will normalize either its stored
	version of the mailbox name or the mailbox name that
	arrives in the RCPT command.  If it does not, and one
	string is normalized and the other is not, or they are
	unnormalized differently, the mail is not going to get
	delivered.  This is the same principle that is applied
	to, e.g., case variations in local parts: while we
	recommend that case variations in ASCII local parts be
	considered equivalent, there is no requirement that a
	receiving server do that.  For example, if the server
	supports Joe@example.com and joe@example.com as names
	for the same mailbox but does not support
	joE@example.com, the latter is not going to be delivered.
	
	* RFC 5336 quite explicitly requires that any domain
	name comparisons be performed as IDNA[2003] specifies,
	i.e., converted to ACE form and compared in that form.
	I would hope that rule would be followed.
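
As a rough illustration of the four points above, the sketch below
contrasts the two halves of an address.  It assumes Python's built-in
"idna" codec, which implements IDNA2003 ToASCII, stands in for the
domain-part processing, and it assumes a receiving server that compares
local parts octet for octet with no normalization.  The function names
(domain_ace, domains_match, local_parts_match_strict) are hypothetical
and chosen only for this example; nothing here is taken from RFC 5336
or from any particular mail server.

```python
import unicodedata


def domain_ace(domain: str) -> bytes:
    # Python's "idna" codec applies IDNA2003 ToASCII (nameprep mapping and
    # NFKC normalization, then Punycode) to each non-ASCII label.
    # Lowercasing the result makes the ACE comparison case-insensitive.
    return domain.encode("idna").lower()


def domains_match(a: str, b: str) -> bool:
    # Compare domain names in ACE form, as IDNA2003 specifies.
    return domain_ace(a) == domain_ace(b)


def local_parts_match_strict(a: str, b: str) -> bool:
    # Assumption: a receiving server that does no normalization and no
    # case folding, so only identical octet strings match.
    return a.encode("utf-8") == b.encode("utf-8")


composed = "b\u00fccher"     # u-umlaut as one precomposed code point
decomposed = "bu\u0308cher"  # "u" followed by a combining diaeresis

# Domain part: both spellings map to the same ACE form.
print(domains_match(composed + ".example", decomposed + ".example"))  # True

# Local part: the same two spellings are different octet strings.
print(local_parts_match_strict(composed, decomposed))  # False

# They agree only if someone normalizes both sides, e.g. to NFC.
print(local_parts_match_strict(
    unicodedata.normalize("NFC", composed),
    unicodedata.normalize("NFC", decomposed),
))  # True

# The ASCII case analogy from the text: an exact-match server
# treats joe and joE as different mailboxes.
print(local_parts_match_strict("joe", "joE"))  # False
```

A server that chose to normalize or case-fold local parts would behave
differently; the point of the sketch is only that the sender cannot
count on it.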

That combination, which I think the WG quite consciously
adopted, arguably creates a much more serious inconsistency in
behavior as seen by the user than the inter-application and
inter-system inconsistencies that have been extensively
discussed on this list.  The user watches what happens with
domain names and concludes that one can be extremely relaxed
about what form is used: unnormalized strings "work",
compatibility characters "work", and so on.  If that turns into
an inference that is then used for the local-part, a lot of mail
isn't going to get delivered because, while 5321 explicitly
recommends (but, again, does not require) case-independent
matching for local parts, 5336 does not even recommend NFC.

And, yes, that is an EAI WG discussion.

    john


