MAYBE-TRANSITIONAL, a historical tale

Elisabeth Blanconil eblanconil at gmail.com
Tue Dec 8 20:17:01 CET 2009


Dear Mark,
This is great!  This is like organizing three Y2Ks!!! Very exciting.

Some suggestions. Some may seem odd, but we need to be practical. Europe
started registering IDNs a few days ago. Fast Track members will start in a
few weeks. So we have to make sure they either support their non-IDNA2003
registrants or wait for 2015.

- until 2013 IDNA will continue to resolve IPv4 addresses, but from 2015 on
it will only resolve IPv6.
- it will partly be French compatible in 2013 and will provide full French
support from 2015.
- the cdTLD (characters deviant TLD) ICANN committee should supervise
application compatibility with the prevailing temporary standard.
- no TLD will be formed using deviation characters before 2019, in order
to allow the remaining 1% of the population not to be overly embarrassed.
- Greek, Macedonian, Austrian, German, Helvetic, Luxembourgish,
Liechtensteinese, European, Polish, French, Deutch, Belgian Governments and
Administrations should be told to make sure that e-government forms adapt to
the UTP (Unicode Transition Plan). Same for banking security applications.
Same for every application using some kind of IRI.
- to make hundreds of millions of people aware at low cost, we could propose
that the EU Multilingualisation Commissary make 2013 and 2015 the SSS Sigma &
Eszett years.

We also have to find a way to police the transition, otherwise we may have
many problems (I am sure ICANN is about to commission a 2-million-dollar study
on their identification/evaluation).

Elisabeth Blanconil

2009/12/8 Mark Davis ☕ <mark at macchiato.com>
>
> Here's a modified proposal, a bit rough yet.
> Live page: http://www.macchiato.com/unicode/idna/transition-proposal
>
> Problem
>
> We would like to have the 4 deviation characters be valid, at some point.
The key problem is that we don't want current URLs in web pages, etc. to go
to two different locations depending on the browser, nor do we want
joe at fußball.com to go sometimes to joe at fußball.com and sometimes to
joe at fussball.com. Even once IDNA2008 is approved, for a long time a majority
of the implementations will still be IDNA2003, so this also goes for new
label registrations during the transition period.
>
> Proposal
>
> IDNA2008 changes as follows:
>
> The 4 deviation characters get the property PVALID_AFTER_2015
> The requirements are:
>
> On registration, PVALID_AFTER_2015 is equivalent to PVALID
> On lookup, PVALID_AFTER_2015 is treated as DISALLOWED up until 2016 Jan 1,
00:00:00 GMT, and treated as PVALID thereafter.
>
> Implementations must not map the characters after the switchover date.
>
> Implementations that map the characters before that date, must map as in
IDNA2003.
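
The rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not text from the proposal: the function names are hypothetical, the exact switchover instant is treated as already "thereafter", and only the four deviation characters are mapped (real IDNA2003 processing does much more, e.g. case folding and normalization).

```python
from datetime import datetime, timezone

# Switchover instant from the proposal: 2016 Jan 1, 00:00:00 GMT.
SWITCHOVER = datetime(2016, 1, 1, tzinfo=timezone.utc)

# How IDNA2003 treats the four deviation characters:
# sharp s -> "ss", final sigma -> sigma, ZWNJ and ZWJ -> nothing.
IDNA2003_MAPPING = {
    "\u00DF": "ss",
    "\u03C2": "\u03C3",
    "\u200C": "",
    "\u200D": "",
}

def effective_status(operation, now):
    """Status of a PVALID_AFTER_2015 character (hypothetical helper).

    `now` must be a timezone-aware datetime.
    """
    if operation == "registration":
        return "PVALID"
    if operation == "lookup":
        # Assumption: the switchover instant itself counts as "thereafter".
        return "PVALID" if now >= SWITCHOVER else "DISALLOWED"
    raise ValueError("operation must be 'registration' or 'lookup'")

def prepare_for_lookup(label, now, maps=True):
    """Apply the proposal's mapping rules to the deviation characters only."""
    if now >= SWITCHOVER:
        return label  # must not map after the switchover date
    if maps:
        # Before the switchover, mapping must be as in IDNA2003.
        return "".join(IDNA2003_MAPPING.get(ch, ch) for ch in label)
    if any(ch in IDNA2003_MAPPING for ch in label):
        raise ValueError("label contains a character DISALLOWED on lookup")
    return label
```

For example, `prepare_for_lookup("fußball", datetime(2013, 6, 1, tzinfo=timezone.utc))` yields `fussball`, while the same call with a date in February 2016 leaves the label untouched.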
>
> The goal is to
>
> allow the 4 characters to become valid, as soon as possible;
> avoid the 'nightmare' scenario of the same URL going to two different
locations, as much as possible.
>
> Scenarios
>
> Let's see what happens with fußball.xxx over time, where xxx is some
registry (eg .de, .blogspot.com, or others). Background: essentially all
browsers and other major implementations are planning to map for
compatibility. We'll look at browsers, but this also applies to email, etc.
>
> Early 2010 (just as IDNA2008 is approved)
>
> At this time the world's browsers are 100% IDNA2003
>
> browsers map fußball.xxx to fussball.xxx.
> registries can start accepting eszett, and should bundle with ss.
> fußball shows up as fussball in the address bar
>
> note: it is only by convention that fussball is seen in the address bar in
this case; a browser could also display fußball, as in UTS46.
>
> results:
>
> if the registry bundles, both fußball.xxx and fussball.xxx go to the same
owner.
> if the registry doesn't bundle, both fußball.xxx and fussball.xxx go to
the same owner.
>
> The odd IDNA2008 browser that doesn't map just fails, because ß is not
PVALID; it doesn't take fußball.xxx to a different location than the vast
majority of browsers.
>
> In 2013
>
> At this time the world's browsers are 50% IDNA2003, 50% IDNA2008
>
> same as above. No ambiguity in results.
>
> In 2016 Feb
>
> At this time the world's browsers are 1% IDNA2003, 99% IDNA2008
>
> 99% of browsers switch to not mapping fußball.xxx.
> Registries no longer need to bundle; they can have different owners for
fußball.xxx and fussball.xxx.
> fußball shows up as fußball in the address bar
> results:
>
> if the registry bundles, both fußball.xxx and fussball.xxx go to the same
owner.
> if the registry doesn't bundle, fußball.xxx and fussball.xxx go to
different owners.
>
> The odd IDNA2003 browser that is left goes to the wrong location for the
affected languages; people that use them need to upgrade.
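
The scenarios above can be replayed with a toy model. Everything in it is an assumption made for illustration: the zone dictionaries, the owner names and the three browser labels are hypothetical, and only the ß/ss case is modelled.

```python
# Hypothetical zone contents for the registry "xxx": label -> owner.
BUNDLED = {"fussball": "owner A", "fußball": "owner A"}
UNBUNDLED = {"fussball": "owner A", "fußball": "owner B"}

def owner_reached(label, browser, after_switchover, zone):
    """Return the owner a browser ends up at, or None if lookup fails."""
    if browser == "idna2003":
        # IDNA2003 always maps sharp s to "ss".
        looked_up = label.replace("ß", "ss")
    elif browser == "idna2008-mapping":
        # Maps for compatibility before the switchover, never after it.
        looked_up = label if after_switchover else label.replace("ß", "ss")
    elif browser == "idna2008-strict":
        # The "odd" browser that never maps: it fails before the switchover.
        if "ß" in label and not after_switchover:
            return None
        looked_up = label
    else:
        raise ValueError("unknown browser type")
    return zone.get(looked_up)

if __name__ == "__main__":
    for after in (False, True):
        for browser in ("idna2003", "idna2008-mapping", "idna2008-strict"):
            for name, zone in (("bundled", BUNDLED), ("unbundled", UNBUNDLED)):
                print(after, browser, name,
                      owner_reached("fußball", browser, after, zone))
```

Before the switchover no combination reaches "owner B", which is the "no ambiguity" property claimed for 2010 and 2013. After it, only the leftover IDNA2003 browser disagrees with the rest, and only when the registry does not bundle.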
>
> Mark
>
>
> On Tue, Dec 8, 2009 at 01:44, "Martin J. Dürst" <duerst at it.aoyama.ac.jp>
wrote:
>>
>> I agree with Mark that while there are similarities between MAYBE and
TRANSITIONAL, there are also huge differences.
>>
>> One difference, which Mark has mentioned, is the number of characters
affected.
>>
>> A second difference is that there would only be one transition from
TRANSITIONAL to PVALID, not a series of transitions from MAYBE to PVALID.
>>
>> A third difference is that MAYBE was essentially saying "we don't have a
clue now, we may have later". In my understanding (I didn't participate in
any meeting), one of the main reasons brought by the Unicode side against
MAYBE was that if it's MAYBE, we can as well look at the thing and decide
now. For TRANSITIONAL, we may know exactly what we want to do, it just
doesn't fit into PVALID and DISALLOWED.
>>
>> BTW, I don't think that any of the dynamic lookup schemes proposed by
Andrew or Erik are feasible; they are quite simply overengineered. We need
something much simpler, even if this temporarily goes against user
convenience.
>>
>> Regards,   Martin.
>>
>> On 2009/12/05 6:45, Mark Davis ☕ wrote:
>>>
>>> I agree with you that there are many similarities between the MAYBE and
>>> TRANSITIONAL. MAYBE at the time wasn't suitable because it was applied
to a
>>> huge number of characters. However, applying the concept (with a few
>>> changes) to these 4 characters for a transitional period is, I think,
>>> feasible.
>>>
>>> Mark
>>>
>>>
>>> On Fri, Dec 4, 2009 at 12:40, John C Klensin <klensin at jck.com> wrote:
>>>
>>>> Once upon a time, not really that long ago, there was a proposal
>>>> to differentiate what is now PVALID by including MAYBE YES and
>>>> MAYBE NO categories.   Anyone interested should try to find a
>>>> copy of draft-klensin-idnabis-issues-06.txt and earlier.  The
>>>> general model, in today's vocabulary, was to put characters (and
>>>> groups of characters) that we weren't sure about into categories
>>>> that would encourage different handling on registration and
>>>> lookup from characters about which we were more certain, to
>>>> permit later reclassification, and to arrange for controlled
>>>> transitions.  There was consensus for removing those categories
>>>> because they made things too fragile, because they would require
>>>> that all registries and applications check for updates and
>>>> changes frequently (which would be too fragile), and so on.
>>>>
>>>> In practice, the only real difference between MAYBE and the sort
>>>> of implied TRANSITIONAL you imply (or the explicit versions
>>>> others have suggested) is that MAYBE would have laid out the
>>>> "this is likely to change" aspect of the situation more clearly,
>>>> while the idea you outline above raises all of the issues that
>>>> the WG has discussed about transitions from DISALLOWED to PVALID
>>>> (and decided that reclassification should require a catastrophic
>>>> situation).
>>>>
>>>> If I remember correctly, both you and Mark were at the meeting
>>>> at which the decision to drop MAYBE was made and were among
>>>> those pushing for that decision, pretty much on the basis
>>>> outlined above.
>>>>
>>>> While I don't object to revisiting that general idea -- under
>>>> the identification of TRANSITIONAL or otherwise -- if the WG
>>>> really feels that it wants to go there and that the old model
>>>> might be worth the aggravation that caused it to be dropped the
>>>> last time around, I hope that everyone does understand that
>>>> TRANSITIONAL, as you and others have described it, is very close
>>>> to that old and discarded idea... close enough that we might
>>>> even be able to borrow text from documents that are now more
>>>> than 18 months old.
>>>>
>>>> best,
>>>>   john
>>>>
>>>> p.s. I'm not going to comment at any length on the "global
>>>> mappings" part of your proposal because I think everything has
>>>> been said already.  Having required global mappings is
>>>> equivalent to _almost_ having U-label <-> A-label symmetry.
>>>> And, of all mappings, "map to nothing" is the worst: while part
>>>> of the problem with a mapping between "ß" and "ss" is that one
>>>> cannot tell by looking at "ss" afterward whether the registrant
>>>> intended "ss" or "ß", one at least knows that "x" or "ab" was
>>>> not intended.  With "map to nothing", the character that was
>>>> eliminated could, in principle, have appeared in any position in
>>>> any domain name label.
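
The asymmetry described in the p.s. can be made concrete with a small sketch. The function names are hypothetical, and the second function only considers a single dropped character, so it understates the real ambiguity.

```python
def eszett_preimages(label):
    """All strings that map to `label` when every "ß" becomes "ss"."""
    if not label:
        return {""}
    # Either the first character was typed as it stands ...
    results = {label[0] + rest for rest in eszett_preimages(label[1:])}
    # ... or a leading "ss" came from a single "ß".
    if label.startswith("ss"):
        results |= {"ß" + rest for rest in eszett_preimages(label[2:])}
    return results

def single_insertion_preimages(label, dropped="\u200D"):
    """Strings with exactly one `dropped` character that map to `label`.

    Assumption: only one dropped character is considered; with "map to
    nothing" any number of them could have been present.
    """
    return [label[:i] + dropped + label[i:] for i in range(len(label) + 1)]
```

With the ß mapping, the candidates are confined to the places where "ss" occurs: `eszett_preimages("fussball")` contains just `fussball` and `fußball`. With "map to nothing", every position in every label is a candidate, so even the two-character label `ab` has three single-insertion candidates.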
>>>>
>>>>
>>>>
>>>> --On Friday, December 04, 2009 04:11 -0800 Erik van der Poel
>>>> <erikv at google.com>  wrote:
>>>>
>>>>> Here is another proposal that is dead simple, yet allows
>>>>> implementations to take advantage of a machine-readable file,
>>>>> and does not involve "flag days" (dates at which we change
>>>>> something).
>>>>>
>>>>> Instead of having a machine-readable file at each host, we
>>>>> have two global files at iana.org. One file is similar to
>>>>> Patrik's table with entries like:
>>>>>
>>>>> 00DF       ; DISALLOWED  # LATIN SMALL LETTER SHARP S
>>>>> 03C2       ; DISALLOWED  # GREEK SMALL LETTER FINAL SIGMA
>>>>> 200C       ; DISALLOWED  # ZERO WIDTH NON-JOINER
>>>>> 200D       ; DISALLOWED  # ZERO WIDTH JOINER
>>>>>
>>>>> There is no new value called TRANSITIONAL. The infamous 4
>>>>> characters (above) start with the value DISALLOWED. Later, we
>>>>> change them to PVALID (or CONTEXTJ for 200C/200D). We
>>>>> encourage ICANN to redelegate TLDs the registries of which
>>>>> flout our rules.
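
Reading a file in the format shown above takes only a few lines. This is a minimal sketch that assumes one code point per line, exactly as in the four sample entries; the function name is hypothetical and code point ranges are not handled.

```python
SAMPLE = """\
00DF       ; DISALLOWED  # LATIN SMALL LETTER SHARP S
03C2       ; DISALLOWED  # GREEK SMALL LETTER FINAL SIGMA
200C       ; DISALLOWED  # ZERO WIDTH NON-JOINER
200D       ; DISALLOWED  # ZERO WIDTH JOINER
"""

def parse_property_table(text):
    """Parse 'CODEPOINT ; VALUE # comment' lines into {code point: value}."""
    table = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue
        code, value = (field.strip() for field in line.split(";", 1))
        table[int(code, 16)] = value
    return table

table = parse_property_table(SAMPLE)
```

Changing an entry later from DISALLOWED to PVALID (or CONTEXTJ) would then only mean republishing the file; an implementation re-reads it and looks up `table[0x00DF]`.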
>>>>
>>>>> The other file is for global mappings. Not language-specific
>>>>> mappings. The format might be similar to RFC 3454's:
>>>>>
>>>>> 0041; 0061; Case map
>>>>> 00AD; ; Map to nothing
>>>>>
>>>>> The absence of a character from this file means that there is
>>>>> no mapping for that character. It maps to itself. The infamous
>>>>> ...
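
A sketch of how the mapping file could be consumed, assuming the three-field 'source; target; comment' layout of the two sample lines. The function names are hypothetical; a character absent from the table maps to itself, as described.

```python
SAMPLE_MAPPINGS = """\
0041; 0061; Case map
00AD; ; Map to nothing
"""

def parse_mapping_table(text):
    """Parse 'SOURCE; TARGET...; comment' lines into {char: replacement}."""
    mapping = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        source = chr(int(fields[0], 16))
        # The target may be empty ("map to nothing") or several code points.
        target = "".join(chr(int(cp, 16)) for cp in fields[1].split())
        mapping[source] = target
    return mapping

def apply_mapping(label, mapping):
    """Map each character; characters not in the table map to themselves."""
    return "".join(mapping.get(ch, ch) for ch in label)

mapping = parse_mapping_table(SAMPLE_MAPPINGS)
```

With these two entries, `apply_mapping("A\u00ADb", mapping)` gives `ab`: the capital letter is case-mapped and the soft hyphen disappears.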
>>>>
>>>> _______________________________________________
>>>> Idna-update mailing list
>>>> Idna-update at alvestrand.no
>>>> http://www.alvestrand.no/mailman/listinfo/idna-update
>>>>
>>>
>>>
>>
>> --
>> #-# Martin J. Dürst, Professor, Aoyama Gakuin University
>> #-# http://www.sw.it.aoyama.ac.jp   mailto:duerst at it.aoyama.ac.jp
>
>