Mapping (was: Issues lists and the "preprocessing" topic)

John C Klensin klensin at jck.com
Sat Aug 23 00:18:58 CEST 2008



--On Wednesday, 20 August, 2008 09:55 -0700 Erik van der Poel
<erikv at google.com> wrote:

> The current IDNA spec is IDNA2003, and it includes the default
> pre-processing steps. Now, IDNA200X is removing (the details
> of) the pre-processing steps, so it ought to explain this
> major difference between IDNA2003 and 200X. This explanation
> can mention the difference between the "default"
> pre-processing (as seen in IDNA2003 and HTML today) and the
> per-locale UI pre-processing such as Turkish dotted/dotless
> uppercase 'i'. If the WG consensus is to leave out any mention
> of UI pre-processing, that is fine, but I think it is quite
> important that IDNA200X explain that the default
> pre-processing of IDNA2003 has been removed.

I agree and you didn't even need to twist my arm.

But I have a problem on which I need advice (and, ideally,
specific text or instructions) from the WG.   It appears to me
that there are three ways to address the differences between
IDNA2003 and IDNA2008, of which this one and the exclusion of
non-letter/non-digit characters may be the ones with the
largest impact.

	(1) We explain what IDNA2008 does and ignore IDNA2003
	and the differences entirely.
	
	(2) We explain what IDNA2008 does and note that it is
	different but don't explain why.
	
	(3) We explain the issues with mapping and why we
	decided to go from a "map whenever possible so as to
	include most Unicode characters somehow" logic to a
	"don't map so as to make behavior more understandable
	and U-labels and A-labels convertible to each other
	without information loss" logic.
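
To make the contrast in (3) concrete, here is a minimal sketch of the
information loss that the "map whenever possible" logic can cause.
It assumes Python's standard-library "idna" codec, which implements
the IDNA2003 ToASCII/ToUnicode operations including the Nameprep
mappings. The helper name round_trips is hypothetical and exists
only for illustration; nothing here is taken from the drafts.

```python
def round_trips(label: str) -> bool:
    """Return True if label survives ToASCII followed by ToUnicode unchanged.

    Uses Python's built-in "idna" codec, which follows IDNA2003
    (mapping via Nameprep), not the no-mapping IDNA2008 model.
    """
    ascii_form = label.encode("idna")      # IDNA2003 ToASCII, with mapping
    unicode_form = ascii_form.decode("idna")  # IDNA2003 ToUnicode
    return unicode_form == label


# Labels that the IDNA2003 mapping step rewrites before encoding cannot
# be recovered from the ASCII form; an already-mapped label can.
for candidate in ("Bücher", "straße", "bücher"):
    print(candidate, candidate.encode("idna"), round_trips(candidate))
```

Under these assumptions, a label that mapping has altered (case-folded
or otherwise rewritten) does not come back from its ASCII form, while
an already-mapped label does. The "don't map" logic aims to make the
second case the only one, so that a U-label and its A-label are always
convertible to each other.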

Rationale and its predecessors started with (3).  I was told to
tear out the text that seemed critical of IDNA2003, with one key
comment being that that type of justification may have been
important in getting where we are but that it is no longer important now
that the WG is chartered, etc.   So we are now roughly at (2)
with some of the WG membership still believing that we should be
at (1), i.e., with all explanatory material discarded.

I'm happy to go back toward (3) --especially since I've never
been convinced that removing those explanations was wise-- but I
have two problems.  I don't know how to say "this is different
from IDNA2003 because..." without saying things about the
IDNA2003 strategy that some will construe as critical and
negative.  Others might be able to do better.   Or it is
possible that what we really need to do is to strengthen (2)
without making a "why this is different" comparison.   I've
already done a bit of the latter and hope that I can finish up
the pending drafts and get them posted this weekend (this last
week or so has been a terror for reasons unrelated to IDNA).
But I don't know if the new text will be adequate; if it is not,
I would really appreciate specific suggestions.

And, at some stage, I think we need Vint to see if he can make
a formal consensus call on whether we want "why this is
different" explanations even if those explanations could be
interpreted as critical of IDNA2003.

    john




More information about the Idna-update mailing list