[Gen-art] LC review: draft-ietf-idnabis-bidi-06.txt

Harald Alvestrand harald at alvestrand.no
Mon Oct 5 19:00:50 CEST 2009

Joel M. Halpern wrote:
> Probably the simplest solution for the first part of the "major" 
> problem is to add text at the front of the rules section (section 2), 
> that states something like:
> The rules about character properties given here are additional 
> restrictions on what characters are permitted in specific DNS labels, 
> augmenting the basic rules as defined in [ref...]
> That would make clear that "." is not allowed just because it is 
> flagged CS, and would tell the reader where to look if they are 
> reading in other than the recommended order, or if it has been too 
> long since they read the other document.
> However, that still leaves the confusion about the fact that section 3 
> states that CS is prohibited, when section 2 says that CS is 
> permitted.   I do not know enough about what is being intended to 
> suggest clarifying text for the parenthetical in section 3.
Thanks for catching this.

The full set of CS characters is:

002E;FULL STOP;Po;0;CS;;;;;N;PERIOD;;;;
00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;
060C;ARABIC COMMA;Po;0;CS;;;;;N;;;;;
202F;NARROW NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;;;;;
2044;FRACTION SLASH;Sm;0;CS;;;;;N;;;;;
FE50;SMALL COMMA;Po;0;CS;<small> 002C;;;;N;;;;;
FE52;SMALL FULL STOP;Po;0;CS;<small> 002E;;;;N;SMALL PERIOD;;;;
FE55;SMALL COLON;Po;0;CS;<small> 003A;;;;N;;;;;
FF0C;FULLWIDTH COMMA;Po;0;CS;<wide> 002C;;;;N;;;;;
FF1A;FULLWIDTH COLON;Po;0;CS;<wide> 003A;;;;N;;;;;
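
As a rough cross-check, a list like the one above can be regenerated from the Unicode data that ships with a local Python interpreter. This is only a sketch and not how the list above was produced. The helper name chars_with_bidi_class is hypothetical, and the result depends on the Unicode version bundled with the interpreter, so it may not match the list above entry for entry.

```python
import sys
import unicodedata


def chars_with_bidi_class(bidi_class):
    """Return (code point, name) pairs whose Bidi_Class equals bidi_class.

    Hypothetical helper for illustration; it scans every code point and
    relies on the Unicode database bundled with this Python build.
    """
    found = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if unicodedata.bidirectional(ch) == bidi_class:
            # Unassigned or unnamed code points get a placeholder name.
            found.append((cp, unicodedata.name(ch, "<unnamed>")))
    return found


cs_chars = chars_with_bidi_class("CS")
for cp, name in cs_chars:
    # Print in a UnicodeData.txt-like "code;name" shape.
    print(f"{cp:04X};{name}")
```

Having FULL STOP show up in such a listing is exactly the concern raised in the review: membership in the CS class says nothing by itself about whether a code point is permitted in a label.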

At the moment, I can't reconstruct what we were thinking when we decided 
to allow it.
The parenthetical remark can go without great loss; nothing depends on 
it. But I wonder why we are allowing CS at all.

> Yours,
> Joel
> John C Klensin wrote:
>> Joel,
>> Speaking for myself only...
>> It may or may not have been wise for the Gen-ART group to parcel
>> these documents out to different reviewers.  On the one hand, it
>> reduces the burden but, on the other, they are closely related
>> and, without seeing the relationship, confusion is not only
>> possible but likely.  Realistically, no one should ever
>> implement "bidi" without implementing all of IDNA2008; to try to
>> do so will lead to all sorts of "interesting" problems.
>> You have spotted one of those.  When the bidi document says that
>> characters from particular categories are allowed, that is not
>> an assertion that _all_ code points falling into any of those
>> categories are allowed, only that the categories are allowed
>> (and need to be treated in bidi-specific ways).  Which
>> characters are actually allowed is specified by application of
>> the rules in "Tables" as invoked and selected from "Protocol".
>> There was extensive discussion in the WG about how, and whether,
>> to split the documents up the way they are divided and
>> reasonably strong (I personally believe stronger than just
>> "rough") consensus that the current arrangement represents the
>> best balance we could arrive upon.
>> If you can suggest ways to make the relationships and, in
>> particular, the relationship of the categories used in Bidi and
>> the character and code-point selections and restrictions in
>> Tables and Protocol, more clear, I assume that the document
>> authors and WG would welcome those suggestions.
>> regards,
>>    john
>> --On Monday, October 05, 2009 10:43 -0400 "Joel M. Halpern"
>> <jmh at joelhalpern.com> wrote:
>>> I have been selected as the General Area Review Team (Gen-ART)
>>> reviewer for this draft (for background on Gen-ART, please see
>>> http://www.alvestrand.no/ietf/gen/art/gen-art-FAQ.html ).
>>> Please resolve these comments along with any other Last Call
>>> comments you may receive.
>>> Document: draft-ietf-idnabis-bidi-06.txt
>>>      Right-to-left scripts for IDNA
>>> Reviewer: Joel M. Halpern
>>> Review Date: 5-Oct-2009
>>> IETF LC End Date: 14-Oct-2009
>>> IESG Telechat date: N/A
>>> Summary: This document is nearly ready for publication as a
>>> proposed  standard.
>>> There is one comment I have marked as Major.  I presume that
>>> the actual  problem is not a defect in the intent of the spec,
>>> but a defect in this  reader's understanding.  Presuming such,
>>> I would ask that clarifying text  be added.
>>> Major issues:
>>> In section 2, when describing the rules for what is allowed in
>>> labels,  CS is allowed in labels.  It is not allowed to start
>>> RTL labels.  This  looks fine, until I realized that CS
>>> includes ".", which I am pretty  sure is not allowed in a
>>> label.
>>> This gets further complicated in section 3, when talking about
>>> "The  Character Grouping requirement", the text talks about
>>> "Delimiterchars"  being CS, WS, or ON.  A parenthetical then
>>> says "They are not allowed in  domain labels."
>>> Since the normative text said that CS is allowed, there seems
>>> to be a  problem.
