Charter changes and a possible new direction

Tue Jan 13 22:39:15 CET 2009

At 10:15 PM +0100 1/13/09, Patrik Fältström wrote:
>The change from just using a table to using a series of rules is a 
>big change that makes things very different.

Both proposals have a series of rules. The IDNA2008 proposal has a series of rules to create the table; that set of rules must be run *every time* Unicode is updated. The IDNAv2 proposal has the exact same series of rules that people have gotten used to for IDNAv1.

>That some of the rules are tables (you mention exceptions 
>and backward compatible lists) does not change the fact this 
>algorithmic approach is Unicode independent.

Disagree. The rules are only independent of Unicode versions if the Unicode Consortium does not make any additions or changes that would affect the table.
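To make the dependency concrete, here is a minimal sketch of a rule-derived table, run against two versions of the Unicode data. The rule itself (`ALLOWED_CATEGORIES`, `derive`, `changed_entries`) is a hypothetical and heavily simplified stand-in, not the derivation in the IDNA2008 drafts; it only shows that the output of any such rule set follows whatever properties the Unicode database reports. It assumes Python's `unicodedata` module, which ships the current database plus a frozen Unicode 3.2 copy as `ucd_3_2_0`.

```python
import unicodedata
from unicodedata import ucd_3_2_0  # frozen Unicode 3.2 database

# ASSUMPTION: toy rule for illustration only, not the real IDNA2008 rules.
ALLOWED_CATEGORIES = {"Ll", "Lo", "Lm", "Mn", "Mc", "Nd"}


def derive(cp, ucd=unicodedata):
    """Derive a table entry for one code point from its general category."""
    category = ucd.category(chr(cp))
    if category == "Cn":
        return "UNASSIGNED"
    if category in ALLOWED_CATEGORIES:
        return "PVALID"
    return "DISALLOWED"


def changed_entries(code_points, old=ucd_3_2_0, new=unicodedata):
    """Return {code point: (old entry, new entry)} where the entry differs."""
    changes = {}
    for cp in code_points:
        before, after = derive(cp, old), derive(cp, new)
        if before != after:
            changes[cp] = (before, after)
    return changes


if __name__ == "__main__":
    # U+0061 (a), U+00DF (sharp s), U+1E9E (capital sharp s, added in 5.1)
    sample = [0x0061, 0x00DF, 0x1E9E]
    for cp, (before, after) in changed_entries(sample).items():
        print(f"U+{cp:04X}: {before} -> {after}")
```

Only the code point that was assigned after Unicode 3.2 changes here, but the same comparison would flag any entry whose underlying property changed between versions.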

>Given no drastic changes 
>are made to Unicode in future versions, we will never see any 
>codepoints be added to the backward compatible list.

Even small, non-drastic changes could require changes to the table; these have been discussed in various threads in the past few months.

>The difference between your proposed approach and IDNA2008 is that for 
>your tables to work, one *has* to update the RFC for every Unicode 
>version.

That's not at all true. Unicode has been updated many times since 2003, and there has been no pressing need to update IDNA for each one.

>Something that is not needed in IDNA2008.

Hopefully true.

>
>> - "The constraints of the original IDN WG still apply to IDNABIS, 
>> namely to avoid disturbing the current use and operation of the 
>> domain name system, and for the DNS to continue to allow any system 
>> to resolve any domain name in a consistent way." If we consider IDNA 
>> to be part of the DNS, then this is no longer true with the current 
>> drafts. In specific, registries that are following the model of 
>> IDNA2003 now must start using registration-binding if they want to 
>> follow IDNA2008 and use European languages such as German or Greek 
>> (and possibly some Arabic languages, depending on the output of the 
>> ASIWG and this WG's adoption of their proposals).
>
>If I do not misunderstand you, I see no difference between IDNA2003 
>and IDNA2008 (or your proposal) regarding this binding that must 
>happen at time of registration. This due to registry policy and 
>language table issues.

Sorry, then you misunderstand me. Under IDNA2003, registries that registered (for example) names in German did not need to keep any name bindings. They will under IDNA2008. More significantly, they will need to add those bindings back to names that are already issued, even if those names would make no sense with an eszett/sharp-s.

>> - "This work is intended to specify an improved means to produce and 
>> use stable and unambiguous IDN identifiers." IDNA2008 makes the 
>> current IDN identifiers unstable for German or Greek (and possibly 
>> some Arabic languages, depending on the output of the ASIWG and this 
>> WG's adoption of their proposals).
>
>I strongly disagree with this conclusion of yours.
>
>The statement is, once again: "This work is intended to specify an 
>improved means to produce and use stable and unambiguous IDN 
>identifiers."
>
>This is true as it is making a very big change from IDNA2003, and that 
>is to have a very well defined definition of an A- and U-label.
>
>The problem you point out has to do with the transition from IDNA2003 
>to IDNA2008 and the fact mappings were part of IDNA2003, so that it 
>is unclear whether for example eszett is "ok" (note the citation) or 
>not.

We may agree. The A/U label idea was a good improvement, but it has made the transition from the current protocol to the new one unclear. In my mind, that makes some of the current labels unstable and ambiguous.

>> Separate from the charter problems, it is also clear that we cannot 
>> meet our original goals of making the update easy to implement. The 
>> original design was based on the idea (that I supported) that an 
>> inclusion-based system would be easier to implement than the mapping-
>> based system in IDNA2003. Over time, that goal clearly became 
>> impossible. We now have a protocol that relies on context-sensitive 
>> and position-sensitive regular expressions.
>
>For specific codepoints, yes. And you will not get away from that if 
>you update IDNA2003. If you want to move forward with your proposal, 
>you have to add exactly the same position dependent rules.

Fully disagree: I see nothing in the draft that says this.

>FWIW, I have no problems throwing away the document I have been 
>working on, but after being a document author of IDNA2003, implementor 
>of IDNA2003 and various stringprep algorithms, document author of 
>IDNA2008, I think you take too lightly on how easy it is to update 
>IDNA2003.

Possibly true; that's one of the reasons I actually wrote a draft instead of just floating the idea.

>If you "just" update IDNA2003, you have to:
>
>- Update the future RFC at EVERY update of the Unicode Standard.

Again: why? Why not do it in, say, five year cycles like we are doing now? Where is the demand?

>- Separate the mappings from the actual codepoints that can be used in 
>the DNS, and come up with a terminology for it.

Sorry, now I am misunderstanding you. Please try again (or be more verbose).

>- Fix the Bidi issues that we knew with IDNA2003 that we did not get 
>right (or at all).

I cannot tell whether or not you read the draft. It fixes both of the primary problems that Harald and Cary found. What others do you see as needed?

>- Still have the regular expressions that say what codepoints are 
>valid where.

Disagree. Please show where in the draft those are needed.

>- Still have issues with transition from IDNA2003 to IDNAv2 (as you 
>call it) as there will be incompatibilities.

Where? All issues between a system running the old version and one running the new version are already taken care of by the handling of unassigned code points.
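For readers who do not have the IDNA2003 mechanism in their heads, here is a rough sketch of the unassigned-code-point handling I mean. As I read RFC 3490 and RFC 3454, a lookup ("query") may allow code points that were unassigned in Unicode 3.2, while a registration ("stored string") must reject them. The helper names (`unassigned_in_unicode_3_2`, `check_label`) are hypothetical, and this is only the unassigned check, not a full ToASCII or nameprep implementation. It relies on Python's `stringprep.in_table_a1`, which implements RFC 3454 table A.1.

```python
import stringprep


def unassigned_in_unicode_3_2(label):
    """Return the characters of `label` listed in RFC 3454 table A.1."""
    return [ch for ch in label if stringprep.in_table_a1(ch)]


def check_label(label, allow_unassigned):
    """Apply the unassigned-code-point check only (not full nameprep).

    allow_unassigned=True models a query; False models a stored string.
    """
    offenders = unassigned_in_unicode_3_2(label)
    if offenders and not allow_unassigned:
        names = ", ".join(f"U+{ord(ch):04X}" for ch in offenders)
        raise ValueError(f"unassigned code points in stored string: {names}")
    return label


if __name__ == "__main__":
    # U+1E9E (capital sharp s) was added after Unicode 3.2.
    label = "stra\u1e9ee"
    check_label(label, allow_unassigned=True)  # query: accepted
    try:
        check_label(label, allow_unassigned=False)  # registration: rejected
    except ValueError as err:
        print(err)
```

An old resolver can therefore pass along a query containing a newer code point, while an old registry will refuse to store one.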

>So I think your document, if that is the basis for future work in this 
>wg, is very very short and to be frank, naive.

Please show where it is too short.

And I am quite willing to admit that, if the WG has a choice between "short and naive" and "long and naive", I think we should pick the former. YMMV.