John C Klensin
klensin at jck.com
Wed Apr 8 21:41:07 CEST 2009
--On Wednesday, April 08, 2009 20:34 +0200 Cary Karp
<ck at nic.museum> wrote:
>> Under IDNA2008, with the way the documents are today, it is an
>> M-label because the "ß" character is PVALID.
>> What causes you to think otherwise?
> I was commenting on the inapplicability of the new term to the
> full battery of codepoints that are remapped under IDNA2003
> (which removes ß from the IDN space entirely). Since we
> haven't discussed any other mapping rules, and ß appears to
> be the focus of particular concern on this list, I suppose I
> was looking for some reassurance that we are beyond
> questioning its PVALID status in IDNA2008.
Aha. I'm assuming that any decision we have made already is
made, at least absent new evidence or new logic that it should
be un-made (the issue of mapping rules and appropriateness of the
Katakana Middle Dot is, IMO, in the "new evidence" category).
That means that IDNA2003 mapping rules don't get used, except
possibly in fallback with a second lookup, to prevent anything
specified in the current IDNA2008 specs from working as IDNA2008
expects it to work, i.e., that IDNA classifications of
characters as PVALID or DISALLOWED supersede anything done in
IDNA2003 as far as mapping or anything else is concerned.
I also agree with Martin that, except possibly as a
second-lookup fallback, we should not be mapping anything that
isn't obviously reasonable for inclusion in a domain name.
The above assumes something that I don't think has been
explicitly discussed, which is that there are two ways to do
this "lookup mapping" job:
(1) View it strictly as a backward-compatibility mechanism, to
be applied only if IDNA2008 lookup (with no mapping) fails.
That guarantees two lookups for any string that is valid under
IDNA2008 but not found and that is at least syntax-valid under
IDNA2003. Even here, the mappings that are permitted need to be
designed so that the targets of the mappings don't un-do
(2) Try to devise a new mapping table that can be applied before
conversion to A-label form and actual DNS resolution. This
would have to be an entirely new mapping table constructed more
or less along the lines Martin has outlined -- lower-case, width
corrections, and a few other things, but not fonts, superscripts
or subscripts, boxed or circled characters, etc.
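As a rough illustration of the two options above, here is a minimal Python sketch. Everything in it is an assumption made for illustration, not anything specified in the IDNA2008 drafts: the function names, the `resolve` callable (a stand-in for actual DNS resolution that returns None when nothing is found), the use of case folding plus NFKC as an approximation of IDNA2003-style mapping, and the choice of "lower-case plus width corrections" as the entire content of the new table.

```python
import unicodedata

def to_a_label(label):
    """Convert one label to A-label form (sketch only, no validity checks)."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")

def legacy_map(label):
    # Rough stand-in for IDNA2003-style mapping (case folding + NFKC).
    # This is an approximation, not the actual Nameprep tables.
    return unicodedata.normalize("NFKC", label.casefold())

def minimal_map(label):
    # Hypothetical new table: width corrections and lower-casing only.
    # Superscripts, circled characters, etc. are deliberately left alone.
    out = []
    for ch in label:
        decomp = unicodedata.decomposition(ch)
        if decomp.startswith(("<wide>", "<narrow>")):
            ch = "".join(chr(int(cp, 16)) for cp in decomp.split()[1:])
        out.append(ch)
    return "".join(out).lower()

def lookup_with_fallback(label, resolve):
    """Option (1): unmapped lookup first, legacy mapping only on failure."""
    result = resolve(to_a_label(label))
    if result is None:
        mapped = legacy_map(label)
        if mapped != label:
            # Second lookup, reached only when the first one found nothing.
            result = resolve(to_a_label(mapped))
    return result

def lookup_with_premap(label, resolve):
    """Option (2): apply the new mapping before conversion to A-label form."""
    return resolve(to_a_label(minimal_map(label)))
```

With a dictionary's `get` method standing in for `resolve`, option (1) finds a registered "straße" on the first lookup and only reaches "strasse" when the first lookup fails, while option (2) never touches "ß" at all because lower-casing leaves it unchanged.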
IMO, the latter is almost certainly the right thing to do if we
have accepted (i) "mapping forever" and (ii) "mapping as a
fundamental part of lookup-side IDNA", rather than as something
that belongs to some normalization step outside IDNA itself
(and that is not necessarily the responsibility of this WG).
> I also feel it might be useful if the descriptive framework
> we're developing for IDNA2008 is, itself, backwards compatible.
Sure. But, again just IMO, I'd rather get finished and then
have those who are interested take on the job of producing a
(probably-Informational) RFC describing the relationship between
IDNA2008 and IDNA2003 terminology. Otherwise I fear we will
spend even more critical-path time getting the details right,
delaying getting finished even further.