tables document [Re: IDNA comments]

Kenneth Whistler kenw at sybase.com
Mon Jul 14 21:57:55 CEST 2008


Patrik,

> > Comments on tables-01

I'm re-reviewing these comments in the light of the newly
posted tables-02 document.


> >   3. "It should be suitable for newer revisions of Unicode, as long  
> > as the
> >   Unicode properties on which it is based remain stable."
> >   Replace by
> >   "This is suitable for any newer versions of Unicod as well.
                                                  ^^^^^^

Typo in Mark's input that made it into tables-02.txt.
                                                    
> Thanks. I will though probably in the last sentence not say "...will  
> be..." but instead "...can be..." as my view is that at the point in  
> time where such an unfortunate incompatibility is detected, this  
> document has to be updated. At that update one can choose to either  
> add the codepoint to 2.2.3 or not (in reality, choose to add to 2.2.3  
> and update the document or not update the document but accept the  
> incompatibility).
> 
> Ok with people?

And yes, I'm fine with that suggestion, too.

> >   7. "In many cases aliases are used in the data in the Unicode  
> > Standard.
> >   This document uses both the alias and the spelled out terms (for  
> > example
> >   alias Ll for the General Category Lowercase_Letter)."
> >   Replace with:
> >   "Unicode property names and property value names may have short
> >   abbreviations, such as gc for the General_Category property, and  
> > Ll for the
> >   Lowercase_Letter property value of that property."
> 
> Is it only property names and property values that have short forms?

In general, yes. What Mark is referring to are the short forms
specified in PropertyAliases.txt (for property names) and
PropertyValueAliases.txt (for property value names).
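
To make the two kinds of short form concrete, here is a minimal
sketch in Python. The two dictionaries are a tiny hand-copied subset
of what those files contain, and the helper name describe() is
hypothetical; nothing here comes from the tables document itself.

```python
import unicodedata

# Tiny hand-copied subset; the real alias files list many more entries.
PROPERTY_ALIASES = {"gc": "General_Category", "sc": "Script"}
GC_VALUE_ALIASES = {
    "Ll": "Lowercase_Letter",
    "Lu": "Uppercase_Letter",
    "Nd": "Decimal_Number",
}


def describe(char):
    """Show the General_Category of char in short and spelled-out form."""
    # unicodedata.category() returns the short value alias, e.g. "Ll".
    short_value = unicodedata.category(char)
    # Fall back to the short form if it is not in our small subset.
    long_value = GC_VALUE_ALIASES.get(short_value, short_value)
    return "gc={} ({}={})".format(
        short_value, PROPERTY_ALIASES["gc"], long_value
    )


print(describe("a"))
```

So "gc" abbreviates a property name, while "Ll" abbreviates one of
that property's values.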

> >   8. Sort the following by value instead of code point, for clarity.
> >   Ideally each value would be in its own subsection: PVALID,  
> > CONTEXTO,...
> >
> >      002D; CONTEXTO  # HYPHEN-MINUS
> >      ...
> >      3007; PVALID    # IDEOGRAPHIC NUMBER ZERO
> >      303B; CONTEXTO  # VERTICAL IDEOGRAPHIC ITERATION MARK
> >      30FB; CONTEXTO  # KATAKANA MIDDLE DOT
> 
> Hmm...what do people think here? I can see reasons to have the  
> codepoints (in the same script) "close" to each other (as it is now),  
> while still of course understand this suggestion.

I agree that the list isn't long enough to bother reorganizing
it by category here.
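
For anyone who does want the by-value view, it is easy to derive
from the code point order. A rough sketch, using only the four
entries quoted above (the elided entries are not filled in) and
hypothetical helper names parse() and group_by_value():

```python
from collections import defaultdict

# Only the entries quoted in the comment above; the "..." is left out.
SAMPLE = """\
002D; CONTEXTO  # HYPHEN-MINUS
3007; PVALID    # IDEOGRAPHIC NUMBER ZERO
303B; CONTEXTO  # VERTICAL IDEOGRAPHIC ITERATION MARK
30FB; CONTEXTO  # KATAKANA MIDDLE DOT
"""


def parse(text):
    """Turn 'XXXX; VALUE  # NAME' lines into (code point, value, name)."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        data, _, name = line.partition("#")
        code_point, value = (field.strip() for field in data.split(";"))
        entries.append((int(code_point, 16), value, name.strip()))
    return entries


def group_by_value(entries):
    """Group entries by value, keeping code point order inside each group."""
    groups = defaultdict(list)
    for code_point, value, name in sorted(entries):
        groups[value].append((code_point, name))
    return dict(groups)


for value, members in sorted(group_by_value(parse(SAMPLE)).items()):
    print(value)
    for code_point, name in members:
        print("    {:04X}  # {}".format(code_point, name))
```

That keeps the document's list in code point order while letting a
reader generate the other ordering on demand.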

> 
> Should also the appendix be sorted in a different way (add an Appendix  
> B in addition to existing Appendix A)?

No. The least contentious (and least voluminous) way forward is
just to leave the Appendix A as it is.



> >   13. "If needed, IANA should (with the help of an appointed expert)
> >   suggest updates of this RFC where BackwardCompatible (Section  
> > 2.2.3) is
> >   updated, a set that is at
> >   release of this document is empty."
> >   This isn't going to work. I suggest that the backwards compatible
> >   character list, the exceptions list, and the context rules all be  
> > in a
> >   single document published by IANA, and controlled by the group  
> > discussed in
> >   rationale. 
> 
> Ok.

I don't see that Mark's concern has actually been addressed
yet on this point (Section 5. IANA Considerations). Rather
than IANA suggesting updates to the RFC, it seems to me
that it would be cleaner if the RFC specified that the
BackwardCompatible list was maintained by IANA along with
its listing of the derived property for each version
of Unicode. The need to add anything to the BackwardCompatible
list will be apparent when the derivation is attempted for
each new version, if anything were to change that required
its invocation. It would be much quicker and cleaner if that
exception list were then generated and maintained by
the same process that generates the derived property
list(s), rather than kicking off a document
revision process to get the RFC updated.

After all, isn't one of the main points here to construct
the RFC so that it doesn't have to be updated with future
revisions of Unicode?
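
To illustrate what I mean by the need being "apparent when the
derivation is attempted", here is a rough sketch. It assumes the
derived property for each Unicode version is available as a mapping
from code point to value, and that moving away from UNASSIGNED is
expected rather than an incompatibility. The function name and the
sample data are invented for illustration; they are not real
derivation results.

```python
def backward_compatible_additions(old_derived, new_derived):
    """Return {code point: old value} for values that changed.

    Code points that were UNASSIGNED in the old version are skipped,
    on the assumption that assigning them is not an incompatibility.
    """
    additions = {}
    for code_point, old_value in old_derived.items():
        if old_value == "UNASSIGNED":
            continue
        new_value = new_derived.get(code_point, old_value)
        if new_value != old_value:
            # Record the old value so it can be preserved.
            additions[code_point] = old_value
    return additions


# Invented sample data, not taken from any real Unicode version.
OLD = {0x0061: "PVALID", 0x1234: "PVALID", 0x5678: "UNASSIGNED"}
NEW = {0x0061: "PVALID", 0x1234: "DISALLOWED", 0x5678: "PVALID"}

for code_point, value in sorted(
    backward_compatible_additions(OLD, NEW).items()
):
    print("{:04X}; {}".format(code_point, value))
```

An empty result means the list stays empty, as it is at release of
the document.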

--Ken

P.S. In general, I'm quite pleased with the progress this
document has made. I find it much simpler and clearer
now.

> 
>     Patrik


