Tables: BackwardCompatible Maintenance

Wed Dec 10 02:40:57 CET 2008

Patrik said:

> As Editor, I would like to get more input.

Mark had indicated the general scope of this on November 19:

<quote>
1. There is no change to the algorithm.

2. Implementers already must get the context table from the IANA site in
order to implement IDNA2008. This would just mean that they'd also have to
get the Backwards Compatibility table at the same time.

3. Instructions to IANA would be roughly:
a. At each new version of Unicode, regenerate the tables (we already want
them to do this).
b. If any character that was PVALID according to the previous table becomes
not PVALID, then add it to the Backwards Compatibility table.
</quote>

Now, to help this along (I hope), I will propose below
*exact* changes to the current text of idnabis-tables-04.txt
which would accomplish what Mark has in mind (and which I
agree with).

=============================================================

Section 2.7. BackwardCompatible (G)

Current text:

G: cp [is] in {}

This category includes the code points that [sic] property values in
versions of Unicode after 5.1 have changed in such a way that the
derived property value would no longer be PVALID or DISALLOWED. If
changes are made to future versions of Unicode so that code points
might change property value from PVALID or DISALLOWED, then this
table can be updated and keep special exception values so that the
property values for code points stay stable.

Suggested revised text:

G: if exists IDNA_Backwards_Compatible_Table
      then cp is in IDNA_Backwards_Compatible_Table
      else cp is in {}

This category includes any code point for which the property values
in any subsequent version of Unicode after 5.1 have changed in such
a way that the derived property value of PVALID (in an earlier
version) has changed to DISALLOWED in the subsequent version,
when calculated without adjustment for backwards compatibility.

The intent of this category is to track a list of any special
exception values required to ensure that any code point once
considered PVALID for IDNA for any version of Unicode will continue
to be PVALID for all future versions of Unicode.

==================================================================

Section 3. Calculation of the Derived property

Current text:

2. If the code point is in BackwardCompatible (Section 2.7), the
   value is according to the table in Section 2.7.

Suggested revised text:

2. If the code point is in BackwardCompatible (Section 2.7), the
   value is PVALID.
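
To illustrate the effect of the revised rule (this is not proposed
draft text), here is a minimal Python sketch. The names
derived_property, remaining_rules and all_disallowed are hypothetical,
and any rule that precedes rule 2 in Section 3 is assumed to have been
applied already.

```python
from typing import AbstractSet, Callable

PVALID = "PVALID"
DISALLOWED = "DISALLOWED"


def derived_property(
    cp: int,
    backward_compatible: AbstractSet[int],
    remaining_rules: Callable[[int], str],
) -> str:
    """Apply the revised rule 2, then fall through to the later rules.

    Rules that come before rule 2 are assumed to have been checked
    already and are not modelled here.
    """
    if cp in backward_compatible:
        # Revised rule 2: membership alone yields PVALID, so the table
        # does not need to store a value per code point.
        return PVALID
    return remaining_rules(cp)


# Hypothetical stand-in for the rest of the derivation: pretend every
# code point would come out DISALLOWED without any adjustment.
def all_disallowed(cp: int) -> str:
    return DISALLOWED
```

With this sketch, a code point listed in the table stays PVALID even
though the stand-in derivation would otherwise report DISALLOWED.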

===================================================================

Section 5.1. IDNA derived property value registry

Current text:

IANA is to keep a list of the derived property for the versions of
Unicode that is [sic] released after (and including) version 5.1. The
derived property value is to be calculated according to the
specifications in sections [sic] Section 2 and Section 3 and not by copying
the non-normative table found in Appendix B.  If needed, IANA should
(with the help of an appointed expert) suggest updates of this RFC
where BackwardCompatible (Section 2.7) is updated, a set that is at
release of this document is [sic] empty.

Suggested revised text:

IANA is to keep a list of the derived property values for the versions of
Unicode that are released after (and including) version 5.1. The
derived property values are to be calculated according to the
specifications in Section 2 and Section 3 and not by copying
the non-normative table found in Appendix B.

At the release of this RFC, which corresponds with Unicode version 5.1,
there is no IDNA_Backwards_Compatible_Table. For any subsequent
version of Unicode, when the derived property values are calculated
for that version, if the calculation demonstrates that the
derived property value for any code point has changed from PVALID
in the prior version to DISALLOWED in the current version, then
IANA is to add that code point to the IDNA_Backwards_Compatible_Table to
be kept with the list of derived property values for that version
of Unicode.
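
As an illustration only (not proposed draft text), the check described
in the paragraph above could be sketched in Python as follows. The
function names update_table and apply_table, and the representation of
each list as a dict from code point to property value, are assumptions
made for the example.

```python
from typing import AbstractSet, Dict, Mapping, Set

PVALID = "PVALID"
DISALLOWED = "DISALLOWED"


def update_table(
    previous: Mapping[int, str],
    unadjusted: Mapping[int, str],
    table: AbstractSet[int],
) -> Set[int]:
    """Return the table extended with every PVALID -> DISALLOWED change.

    previous   : list published for the prior Unicode version
    unadjusted : new derivation, before any compatibility adjustment
    table      : current backwards compatibility table (may be empty)
    """
    regressed = {
        cp
        for cp, value in previous.items()
        if value == PVALID and unadjusted.get(cp) == DISALLOWED
    }
    # Entries are only ever added, never removed.
    return set(table) | regressed


def apply_table(
    unadjusted: Mapping[int, str], table: AbstractSet[int]
) -> Dict[int, str]:
    """Rederive the list so that every code point in the table is PVALID."""
    return {
        cp: (PVALID if cp in table else value)
        for cp, value in unadjusted.items()
    }


# Made-up data: U+1234 was PVALID and would regress in the new version.
previous = {0x0061: PVALID, 0x1234: PVALID, 0x0020: DISALLOWED}
unadjusted = {0x0061: PVALID, 0x1234: DISALLOWED, 0x0020: DISALLOWED}

table = update_table(previous, unadjusted, set())
published = apply_table(unadjusted, table)
```

Because the table only grows, running the same check again for a later
version keeps earlier entries in place.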

===================================================================

That's it, I think.

Some items of note for discussion:

1. This focusses on the *important* backwards compatibility issue
here, which is to ensure that if a character is ever PVALID it
must always *stay* PVALID in subsequent versions. The transition
from DISALLOWED to PVALID, if it ever occurs, is of much less
concern, because it is not materially different from the
transition UNASSIGNED to PVALID, which will be common. I don't
see any good rationale for mucking up the statement of
BackwardCompatible for IANA and for implementers by worrying
about keeping things DISALLOWED. If any character ever becomes
sufficiently controversial that the issue has to be opened up
for decision, it would be a matter of handling the special
exceptions list and require IETF review, anyway.

2. As stated above, the issue for IANA is simply one of completely
rote derivation of the lists -- which I think is precisely what
we are after here. The specification says essentially:

   a. perform the derivation and post the new list
   b. check the list for backwards compatibility with the
      prior version's list, to detect any PVALID-->DISALLOWED
      transition
   c. if any occur, *add* them to the IDNA_Backwards_Compatible_Table
      and rederive the list, so those code points stay PVALID

Simple to do. Error-proof, as long as the rest of the derivation
is stated clearly enough.

3. We all hope that all of this will be moot forever, anyway. We
are just building in the escape clause in the specification to
deal with the eventuality *if* an undesirable change occurs in
character properties, which would otherwise impact IDNA stability.
Given all the other constraints on Unicode character properties
these days, the chances of any *actual* case happening are rather
small, and are vanishingly small for any but the really obscure
historical characters in lesser-known scripts, where the character
properties are less well-understood in the first place.

4. Finally, if the working group participants don't like the
approach of starting with no IDNA_Backwards_Compatible_Table and
only creating one under the eventuality of its future requirement,
Section 2.7 and Section 5.1 could be slightly modified to
assume that an IDNA_Backwards_Compatible_Table exists from
the start, but starts out empty. This is a little easier to
state, but in practice it only would make any difference if we
truly believed that such backwards compatible exceptions would
start to surface sometime soon.
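
For what it is worth, an implementation sees no difference between the
two formulations. A minimal Python sketch of the category G test,
assuming a missing table is represented as None (the name
in_category_g is hypothetical):

```python
from typing import AbstractSet, Optional


def in_category_g(cp: int, table: Optional[AbstractSet[int]]) -> bool:
    """Category G membership test from the suggested Section 2.7 text."""
    if table is None:
        # No IDNA_Backwards_Compatible_Table exists: "cp is in {}".
        return False
    return cp in table


# A missing table and an empty table give the same answer for every
# code point, so the choice only affects how the text is worded.
no_table = None
empty_table = frozenset()
```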

--Ken