Charter changes and a possible new direction

Andrew Sullivan ajs at shinkuro.com
Wed Jan 14 05:26:14 CET 2009


Hi,

On Wed, Jan 14, 2009 at 04:56:55AM +0100, Patrik Fältström wrote:
> On 14 jan 2009, at 04.19, Andrew Sullivan wrote:
> 
> > So, I have two questions:
> >
> > 1.  Just how bad is it to tie IDNA to a particular version of
> > Unicode, and why?  (Ok, maybe this is two questions.)
> 
> Mainly because if a new version of Unicode is released, a registry can  
> not use the added codepoints in IDNs until there is a new version of  
> IDNA released.

Yes, I get this, but _how bad_ is that?  Are the users of the code
points since 2003 the ones who are driving the current work?  I don't
have that impression, but since I don't hang out in the circles where
most of the pressure appears to me to be coming from, I'm fully
prepared to be wrong.  My impression is that the people unsatisfied
with IDNA2003 are unsatisfied for other reasons.

Note, too, that it seems to me the current effort is taking a long
time because it's _not_ a straight "fix the tables" effort to bring
IDNA2003 up to date with Unicode.  I'm sympathetic, however, to the
problem that a smaller effort could well just not get adequate review.

> Another reason is that a programmer who implements anything with
> Unicode can not know what version of Unicode is installed, so use of
> the installed tables is just not possible if you want to be
> conformant. You have to explicitly use IDNA2003 libraries (today) and
> not any of the Unicode libraries installed, as there might be
> incompatibilities (which codepoints are unassigned, for example).

Again, how bad is that?  This is the issue I used to be sure was
obviously a big deal, but about which I'm now much less sure.  The
discussion in RFC 4690 sure makes it sound like a big deal, but in the
absence of a complete example I'm just not sure what to think.
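
The table mismatch itself (though not its cost) is easy to show.  What
follows is only a minimal sketch in Python, assuming the standard
library's stringprep module (whose tables follow RFC 3454 and so are
frozen at Unicode 3.2) and its unicodedata module (which tracks whatever
Unicode version the interpreter was built with).  The code point U+0221
is my own choice of illustration: it was unassigned in Unicode 3.2 and
assigned in Unicode 4.0.

```python
import stringprep
import unicodedata

# Illustrative code point: unassigned in Unicode 3.2, assigned in 4.0.
ch = "\u0221"

# stringprep uses the RFC 3454 tables, which are fixed at Unicode 3.2.
# Table A.1 lists the code points unassigned in that version.
unassigned_in_idna2003 = stringprep.in_table_a1(ch)

# unicodedata reflects the Unicode version shipped with this interpreter.
category_installed = unicodedata.category(ch)

print("Installed Unicode version:", unicodedata.unidata_version)
print("Unassigned per RFC 3454 table A.1:", unassigned_in_idna2003)
print("General category in installed tables:", category_installed)
```

The two answers disagree, which is the incompatibility in question.  It
says nothing about how often such a code point turns up in a real name.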

I can think of different ways in which this could be bad:

1.  New (non-IDNA2003) code tries to resolve a domain that is not
legal under IDNA2003.

2.  New code thinks a domain that _is_ legal under IDNA2003 is not
legal.

3.  New code resolves a domain differently than truly
IDNA2003-compliant code.

(1) is ugly and annoying, but not actually that harmful.  In the list
of "crap thrown at the global DNS", it's surely down in the noise
category (compared to, say, queries for .lan).  (2) is very bad,
because it says that a domain that could be registered fails to
resolve with the new libraries.  Similarly, (3) says that you get a
different result depending on the libraries involved.  If we have
examples of (2) or (3), however, I'm not aware of them.  I don't think
I've seen a worked example of the issues coming from the normalization
trouble outlined in RFC 4690 section 3.1, so I can't evaluate how
serious the problem is (I'm not therefore dismissing it -- I just
don't know what weight to put on it).
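
To make the three cases testable, here is a rough sketch in Python of a
comparison harness.  Everything in it is an assumption for illustration:
classify, Converter and strict_new are hypothetical names, strict_new is
a toy stand-in for "new code" (not IDNA2008), and the IDNA2003 side
relies on Python's built-in "idna" codec.

```python
from typing import Callable, Optional

# A converter maps a Unicode label to its ASCII (wire) form,
# or returns None if it considers the label illegal.
Converter = Callable[[str], Optional[str]]


def idna2003(label: str) -> Optional[str]:
    """IDNA2003 conversion via Python's built-in codec."""
    try:
        return label.encode("idna").decode("ascii")
    except UnicodeError:
        return None


def strict_new(label: str) -> Optional[str]:
    """Hypothetical 'new code': no mapping, rejects uppercase labels."""
    if label != label.lower():
        return None
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def classify(label: str, new: Converter, old: Converter = idna2003) -> str:
    """Report which of the three cases (if any) a label falls into."""
    old_wire, new_wire = old(label), new(label)
    if old_wire is None and new_wire is not None:
        return "case 1: new code queries a name IDNA2003 rejects"
    if old_wire is not None and new_wire is None:
        return "case 2: new code rejects a name IDNA2003 accepts"
    if old_wire is not None and old_wire != new_wire:
        return "case 3: same name, different wire form"
    return "agree"


for name in ("bücher", "Bücher"):
    print(name, "->", classify(name, strict_new))
```

Running real converters over real registration data through something
like this would give the worked examples of (2) or (3) that I have not
seen.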

Answering "how bad" here is, I think, very important.  I'm
increasingly distressed at the degree of complication IDNA2008 appears
to be proposing.  For instance, I think there is every reason to
suppose that the "local mapping" approach IDNA2008 is taking (and
which still, I emphasise, appeals to me) opens a wider vector for
phishing problems even than we have with IDNA2003.  We have decided
phishing isn't a problem we can solve, but people are going to be
angry if instead we make it worse.

Moreover, we have noted that registries will need guidance, and that
some of the people making decisions about implementation may not
really be in a position to completely understand the full document
set.  Surely if that's the case (and I don't deny it), a system as
complicated as IDNA2008 is going to encounter some trouble.
Especially since we don't seem to want to add the implementers'
document to our document set.

Finally, we are making what is plainly an incompatible on-the-wire
change by deciding that ß is in, without changing the encoding prefix.
The more I ponder that, the more I think we must have taken leave of
our senses.  (I include myself -- more than anyone else! -- in that
"we".)
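
A small sketch of what that incompatibility looks like on the wire, in
Python.  It assumes the built-in "idna" codec (which implements
IDNA2003) and the built-in "punycode" codec; the label "straße" and the
variable names are mine, and the second conversion only approximates
what a no-mapping conversion would produce for this one label.

```python
label = "straße"

# IDNA2003: Nameprep maps ß to "ss" before encoding, so the result
# is a plain ASCII label.
idna2003_wire = label.encode("idna")

# With ß treated as a valid character and no mapping applied, the
# label is Punycode-encoded under the same "xn--" prefix.
unmapped_wire = b"xn--" + label.encode("punycode")

print(idna2003_wire)   # b'strasse'
print(unmapped_wire)   # b'xn--strae-oqa'
```

Same user input, same prefix, two different names in the DNS.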

These all seem to me to be very big costs.  If we look at this as a
trade-off, is the gain worth costs that big?  I am embarrassed to say
that I really don't know.  Paul's draft, however, has made me stop and
think about these questions, and I'm not at all happy with how I feel
when I think about them.

> The plan that I am pushing for (and to be honest, as document
> editor I have not heard from the wg chair what the actual consensus
> is) is that we need protocol action for _any_change_of_the_document_,
> which implies action only for changes to exceptions, backward
> compatibility and regular expressions, and not when Unicode comes
> with a change that adds things that do not require changes to any of
> those.

Ok, excellent, thanks.  That's indeed a significant advantage to the
IDNA2008 approach.

A

-- 
Andrew Sullivan
ajs at shinkuro.com
Shinkuro, Inc.

