Issues list for draft-ietf-idnabis-tables document

Tue Jul 15 01:07:21 CEST 2008

Patrik,

My feedback on these issues:

> P1: Order of codepoints in, for example, the exceptions table
> 
> Current state: Order is by codepoint value
> 
> Proposal: Sort by value instead of code point, for clarity. Ideally  
> each value would be in its own subsection: PVALID,
> CONTEXTO,...
> 
> Comment from editor: I have seen more voices against this proposal  
> than in favor.
> 
> Suggested action: None

I concur with that position. The list isn't long enough for
restructuring it to be critical to understanding it. It is
good enough as it is.

> P2: Split appendix A (non-normative list of codepoints) in one list  
> per property value
> 
> Current state: The list is not split
> 
> Proposal: Split the list in one per property value, or at least order  
> by property value.
> 
> Comment from editor: I suggested this after seeing [P1]. Voices were
> raised against this proposal.
> 
> Suggested action: None

I concur with that position. It isn't worth doing for this
appendix listing.

> P3: IANA is to update the Backward Compatible list.
> 
> Current State: IANA is to host a list of derived values such as the  
> one in appendix A, and update this list whenever a new version of  
> Unicode is published. IANA is not to update any of the tables in this  
> document. That is done only by updating this RFC.
> 
> Proposal: Have the backwards compatible character list, the exceptions  
> list, and the context rules all be in a single document published by  
> IANA, and controlled by the group discussed in rationale.
> 
> Comment from editor: Most voices want changes to this document be by  
> updating it, i.e. by requiring IESG action.
> 
> Suggested action: None

I favor an intermediate position. I think the backwards compatible
character list should be part of the documentation of the derived
property that the current document already calls for with each
version of Unicode. This is the *easy* and automatic part of
updating, and doesn't require engaging any long, drawn-out
review and approval process, in my opinion. The whole point
of having *any* entry in the backwards compatible character list
is simply to keep the derivation stable, on the off chance
that some future version of Unicode changes a character property
value in a way that would otherwise remove PVALID status from
some character that was PVALID as of Unicode 5.1.
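
To make the stability point concrete, here is a minimal sketch in Python.
It is not the derivation from the draft: the function names, the
one-line stand-in rule, and the use of U+E000 as a placeholder code
point are all assumptions made up for illustration. It only shows how
an entry in a backward compatible list would pin a value that the
property-based rules alone would let drift.

```python
# Toy illustration only: the names, the stand-in rule, and the code point
# U+E000 (a private-use placeholder) are assumptions, not taken from the draft.
PVALID = "PVALID"
DISALLOWED = "DISALLOWED"


def derive_from_properties(cp, letters_and_digits):
    """Stand-in for the real, much longer property-based derivation."""
    return PVALID if cp in letters_and_digits else DISALLOWED


def derived_property(cp, letters_and_digits, backward_compatible):
    """Consult the backward compatible list before the property-based rules."""
    if cp in backward_compatible:
        return backward_compatible[cp]
    return derive_from_properties(cp, letters_and_digits)


CP = 0xE000
unicode_5_1 = {CP}       # pretend CP qualifies under Unicode 5.1
later_version = set()    # pretend a later version changes its properties

before = derived_property(CP, unicode_5_1, {})               # value under 5.1
drifted = derived_property(CP, later_version, {})            # no list entry
pinned = derived_property(CP, later_version, {CP: PVALID})   # with list entry
```

With an empty list the value for the placeholder changes between the
two pretend versions; with the single entry it does not, which is all
the list is for.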

Context rules are another thing altogether. I'm generally not
in favor of adding any of them, other than the minimum required
to deal with bidi and the allowance of ZWJ/ZWNJ -- which should
be handled in the initial definition of the protocol. After
that, if some new context rule proves necessary, then I agree
it should probably be done with thorough review in the IETF
context. If that requires updating the RFC and IESG action,
then so be it. That is a good reason to leave the definition
of any context rules in the RFC(s) rather than making them
part of an IANA-maintained list.

> 
> P4: Remove Appendix A
> 
> Current state: Appendix A holds a non-normative table of derived  
> property values for Unicode 5.1.
> 
> Proposal: Remove the appendix as developers and readers do not
> understand the difference between the appendix being normative or
> non-normative.
> 
> Comment from editor: At least during development of this I-D, the  
> appendix has helped. I also think personally people do understand the  
> difference. So I personally am not as worried.
> 
> Suggested action: None

I concur with that position. I think the document is careful
enough in claiming what is normative and what is not.

The lingering concern here is that implementers will ultimately
need a reliable list, rather than each attempting to independently
re-implement what is a rather tricky derivation (as demonstrated
by the long process of shaking down the bugs in the table
derivation for this document).

This is a case, unlike that of many internet protocols, where
it simply isn't going to be feasible to write and publish
code for an implementation of the algorithm for the derivation,
particularly if it doesn't depend on some already implemented
major library like ICU to get the right answers.

It would be a whole lot friendlier to implementers if the
IANA derived list could be *certified* correct (i.e., have
some kind of normative status), so that folks who implement
IDNAbis can depend on that exact list (which will be the
same for everybody). That is a reduce-the-risk strategy,
it seems to me, since the rest of the IDNA code for a
protocol implementation is already complicated enough.
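
As a rough sketch of what depending on that exact list could look like
for an implementer, the Python below parses a range-style table and
looks up a code point in it. The file layout ("first..last ; VALUE",
with "#" comments) and the three sample rows are assumptions for
illustration; they are not copied from any IANA publication and should
be checked against whatever list is actually published.

```python
import bisect

# Hypothetical excerpt; the layout and the rows are illustrative assumptions.
SAMPLE = """\
002D       ; PVALID   # single code point
0030..0039 ; PVALID   # inclusive range
0061..007A ; PVALID
"""


def parse_table(text):
    """Return sorted (first, last, value) rows parsed from the table text."""
    rows = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and padding
        if not line:
            continue
        cps, value = (part.strip() for part in line.split(";"))
        first, _, last = cps.partition("..")
        rows.append((int(first, 16), int(last or first, 16), value))
    rows.sort()
    return rows


def lookup(rows, cp):
    """Return the listed value for cp, or None if no row covers it."""
    starts = [row[0] for row in rows]
    i = bisect.bisect_right(starts, cp) - 1    # last row starting at or before cp
    if i >= 0 and rows[i][0] <= cp <= rows[i][1]:
        return rows[i][2]
    return None


rows = parse_table(SAMPLE)
```

The lookup involves no Unicode property data at all, so two
implementations reading the same list cannot disagree about a code
point, whatever library they are built on.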

--Ken