MAYBE-TRANSITIONAL, a historical tale

Shawn Steele Shawn.Steele at microsoft.com
Tue Dec 8 20:19:59 CET 2009


I’m concerned that this proposal merely obscures the compatibility problem and further delays complete adoption of the characters.

I’ve stuck some comments and emphasis in Mark’s scenarios.

-Shawn

From: idna-update-bounces at alvestrand.no [mailto:idna-update-bounces at alvestrand.no] On Behalf Of Mark Davis
Sent: ,  08,  2009 7:57
To: Martin J. Dürst
Cc: idna-update at alvestrand.no; John C Klensin; Erik van der Poel
Subject: Re: MAYBE-TRANSITIONAL, a historical tale
Scenarios
Let's see what happens with fußball.xxx over time, where xxx is some registry (e.g. .de, .blogspot.com, or others). Background: essentially all browsers and other major implementations are planning to map for compatibility. We'll look at browsers, but this also applies to email, etc.
Early 2010 (just as IDNA2008 is approved)
At this time the world's browsers are 100% IDNA2003

  1.  browsers map fußball.xxx to fussball.xxx.
  2.  registries can start accepting eszett, and *SHOULD* bundle with ss.
Obviously .at & .de will bundle; they’ve said so.  I would fully expect that most of the generic blog/photo/space type lower zones will completely ignore the problem unless it’s shoved in their face, in which case two things can happen:

a)      Users can register fussball.somezone.de, and accessing it as fußball.somezone.de with the 100% IDNA2003 browsers will succeed.  They won’t realize that somezone isn’t acting as recommended.

b)      Users could register fußball.somezone.de and it won’t work.  They may or may not figure out what’s going on.  Some zones could end up with an inaccessible account.  (The user has just paired their email address to xn--something.somezone.de, and since the browser can’t go to http://xn--something.somezone.de/admin, they can’t close the account or do anything else.  Customer support may be very difficult.  Yes, this has happened.  The only “fix” is to try again with a different user account.  Hopefully the zone would eventually fix this.)

  3.  fußball shows up as fussball in the address bar

     *   note: it is only by convention that fussball is seen in the address bar in this case; a browser could also display fußball, as in UTS46.

  4.  results:

     *   if the registry bundles, both fußball.xxx and fussball.xxx go to the same owner.
     *   if the registry doesn't bundle, both fußball.xxx and fussball.xxx go to the same owner.

  5.  The odd IDNA2008 browser that doesn't map just fails, because ß is not PVALID; it doesn't take fußball.xxx to a different location than the vast majority of browsers.
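
As a rough illustration of the mapping behavior in this scenario: Python's built-in "idna" codec implements IDNA2003, so it can stand in for a 2003-era browser. The helper names below are hypothetical, and unmapped_a_label is only a sketch of the label a non-mapping lookup would send if ß were accepted; it performs no IDNA2008 validation.

```python
def idna2003_lookup_form(label: str) -> str:
    """Label as an IDNA2003 client would look it up.

    Python's built-in "idna" codec follows IDNA2003 (RFC 3490 with
    Nameprep), which maps eszett to "ss" before lookup.
    """
    return label.encode("idna").decode("ascii")


def unmapped_a_label(label: str) -> str:
    """Hypothetical helper: lowercase and Punycode-encode without mapping.

    This only approximates a non-mapping lookup; it does not check any
    IDNA2008 validity rules.
    """
    label = label.lower()
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


if __name__ == "__main__":
    # The mapping client and the non-mapping client ask for different names.
    print(idna2003_lookup_form("fußball"))
    print(unmapped_a_label("fußball"))
```

The two functions return different ASCII labels for the same typed name, which is the root of the "two different locations" concern when a registry does not bundle.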

In 2013
At this time the world's browsers are 50% IDNA2003, 50% IDNA2008

  1.  same as above. No ambiguity in results.
Note that on zones that do not bundle we still have the same situation:  fussball.somezone.de will appear to work when accessed as fußball.somezone.de.  Also, someone “planning ahead” further than the zone admin could register fußball.somezone.de, planning to hijack the name when 2016 happens.
In 2016 Feb
At this time the world's browsers are 1% IDNA2003, 99% IDNA2008

  1.  99% of browsers switch to not mapping fußball.xxx.
  2.  Registries no longer need to bundle; they can have different owners for fußball.xxx and fussball.xxx.
  3.  fußball shows up as fußball in the address bar
  4.  results:

     *   if the registry bundles, both fußball.xxx and fussball.xxx go to the same owner.
     *   if the registry doesn't bundle, fußball.xxx and fussball.xxx go to different owners.

  5.  The odd IDNA2003 browser that is left goes to the wrong location for the affected languages; people that use them need to upgrade.
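
The 2016 results above can be sketched as a small model. This is an illustration under stated assumptions, not anything taken from the proposal text: the Browser enum and the owner_reached function are hypothetical names, both fussball.xxx and fußball.xxx are assumed to be registered, and an IDNA2003 browser is modeled as simply replacing ß with ss.

```python
from enum import Enum


class Browser(Enum):
    IDNA2003 = "maps ß to ss before lookup"
    IDNA2008 = "looks up ß unmapped"


def owner_reached(typed: str, browser: Browser, bundled: bool) -> str:
    """Return which registrant a lookup of `typed` reaches.

    Assumes both the ss and the ß spelling are registered. If the
    registry bundles, one owner holds both names.
    """
    if browser is Browser.IDNA2003:
        looked_up = typed.replace("ß", "ss")  # simplified IDNA2003 mapping
    else:
        looked_up = typed
    if bundled:
        return "owner of the bundle"
    return "owner of " + looked_up


if __name__ == "__main__":
    for bundled in (True, False):
        for browser in Browser:
            result = owner_reached("fußball", browser, bundled)
            print(f"bundled={bundled!s:5} {browser.name}: {result}")
```

Only the unbundled case produces two different answers for the same typed name, depending on which kind of browser the user happens to run.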
Now there are suddenly several interesting issues:

a)      Users of .de and .at are probably oblivious since the names are bundled.

b)      Browsers with a bug are finally revealed when German-speaking users suddenly go somewhere else (or fail).  (And I didn’t even opt in to today’s update!  And Patch Tuesday isn’t on the 1st!  That one hopefully wouldn’t have a bug, but you get the idea.)

c)      The person who registered fussball.somezone.de in early 2010 and observed it “working” with fußball.somezone.de (so nobody has ever even used fussball for lookup) suddenly gets broken because somezone isn’t bundling.

d)     The troublemaker who thought ahead, registered fußball.somezone.de, and patiently waited is suddenly able to hijack the site.

I really appreciate that Mark’s trying to figure out a way to solve the problem, and I feel bad since we usually have similar views, but I think that Mark’s proposal has enough problems that it doesn’t really help.  In short, I see a few impacts:


·         For completely compliant registries/browsers, it delays adoption of the new characters for 6 years.  Although people in Germany may “deal”, that seems like a blocker for the Greek Final Sigma.

·         For .de and .at there’s zero impact (except the delay) over just making them PVALID right now.  .de & .at have already stated their intent to bundle, so the transition syntax isn’t helpful to them.  This is true of any compliant registry/zone operator.

·         If you’re not compliant it seems that there’re two reasons why:

o   You don’t care, in which case any “pain” is just delayed 6 years.

o   You don’t know, in which case any “pain” is also just delayed 6 years.

·         This transition doesn’t help with “caring” or “knowing” because there’s no feedback.  Until 2016, everything still “works”, so the zone operators and end-users will continue to be oblivious, until it breaks in 2016.

·         For knowledgeable people of nefarious intent, there’s an opportunity to discover the problem and figure out a way to take advantage of the 2016 transition date.

·         There’s a potential for bugs on the client side.

·         Currently there are some links that would break by making these PVALID.  Educated zone operators (.de & .at) can fix these without waiting 6 years.  That leaves some number of potentially breaking links from unknowing zone operators.  IMO waiting 6 years merely increases the number of possible breaking links.  So it seems that the break would be *worse* in 2016 than now.

·         For zones that intentionally distinguish between ss and ß, this does provide some window of opportunity to get (some, not all of) the legacy browsers out of the system.  Given that the zones that care intend to bundle, this seems of marginal utility at best.  Even if it were interesting, they’d still have to bundle until 2016, so the distinction couldn’t be made until then, and by then they’d probably have a large mass of bundled forms.

In short, it seems to me that this merely delays the pain, and complicates the pain points.  I would rather see a clean breaking change in IDNA2008 with no mitigation of the break.  I’d have to think about it and consider what the rest of the community thinks, but IE/Windows may feel like the risk is lower by just treating PVALID_AFTER_2015 as PVALID.

-Shawn

(Rest of Mark’s mail for reference)

Here's a modified proposal, a bit rough yet.

Live page: http://www.macchiato.com/unicode/idna/transition-proposal
Problem
We would like to have the 4 deviation characters be valid, at some point. The key problem is that we don't want current URLs in web pages, etc. to go to two different locations depending on the browser, nor do we want joe at fußball.com to go sometimes to joe at fußball.com and sometimes to joe at fussball.com. Even once IDNA2008 is approved, for a long time a majority of the implementations will still be IDNA2003, so this also goes for new label registrations during the transition period.
Proposal
IDNA2008 changes as follows:
The 4 deviation characters get the property PVALID_AFTER_2015

The requirements are:

     *   On registration, PVALID_AFTER_2015 is equivalent to PVALID
     *   On lookup, PVALID_AFTER_2015 is treated as DISALLOWED up until 2016 Jan 1, 00:00:00 GMT, and treated as PVALID thereafter.

        *   Implementations must not map the characters after the switchover date.

     *   Implementations that map the characters before that date must map as in IDNA2003.
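
The requirements above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation. The function names are hypothetical. The set of four deviation characters (ß, final sigma, ZWNJ, ZWJ) follows UTS46, since the proposal text itself does not list them. GMT is treated as UTC, and the switchover instant itself is assumed to count as "thereafter".

```python
from datetime import datetime, timezone

# Switchover from the proposal: 2016 Jan 1, 00:00:00 GMT (treated as UTC here).
SWITCHOVER = datetime(2016, 1, 1, 0, 0, 0, tzinfo=timezone.utc)

# Assumption: the four deviation characters as listed in UTS46.
DEVIATION_CHARS = {
    "\u00df",  # LATIN SMALL LETTER SHARP S (eszett)
    "\u03c2",  # GREEK SMALL LETTER FINAL SIGMA
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
}


def registration_status(ch: str) -> str:
    """On registration, PVALID_AFTER_2015 is equivalent to PVALID."""
    if ch not in DEVIATION_CHARS:
        raise ValueError("not a PVALID_AFTER_2015 character")
    return "PVALID"


def lookup_status(ch: str, now: datetime) -> str:
    """On lookup, DISALLOWED before the switchover and PVALID from then on.

    `now` must be a timezone-aware datetime.
    """
    if ch not in DEVIATION_CHARS:
        raise ValueError("not a PVALID_AFTER_2015 character")
    return "DISALLOWED" if now < SWITCHOVER else "PVALID"


if __name__ == "__main__":
    before = datetime(2015, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
    after = datetime(2016, 1, 1, 0, 0, 1, tzinfo=timezone.utc)
    print(lookup_status("ß", before), lookup_status("ß", after))
```

Note that the lookup result depends on the client's clock, so two clients with different clock settings can disagree about the same name around the switchover.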

The goal is to

     *   allow the 4 characters to become valid, as soon as possible;
     *   avoid the 'nightmare' scenario of the same URL going to two different locations, as much as possible.
