[Gen-art] LC review: draft-ietf-idnabis-bidi-06.txt
harald at alvestrand.no
Mon Oct 5 19:00:50 CEST 2009
Joel M. Halpern wrote:
> Probably the simplest solution for the first part of the "major"
> problem is to add text at the front of the rules section (section 2),
> that states something like:
> The rules about character properties given here are additional
> restrictions on what characters are permitted in specific DNS labels,
> augmenting the basic rules as defined in [ref...]
> That would make clear that "." is not allowed just because it is
> flagged CS, and would tell the reader where to look if they are
> reading in other than the recommended order, or if it has been too
> long since they read the other document.
> However, that still leaves the confusion about the fact that section 3
> states that CS is prohibited, when section 2 says that CS is
> permitted. I do not know enough about what is being intended to
> suggest clarifying text for the parenthetical in section 3.
Thanks for catching this.
The full set of CS characters is:
00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;
202F;NARROW NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;;;;;
FE50;SMALL COMMA;Po;0;CS;<small> 002C;;;;N;;;;;
FE52;SMALL FULL STOP;Po;0;CS;<small> 002E;;;;N;SMALL PERIOD;;;;
FE55;SMALL COLON;Po;0;CS;<small> 003A;;;;N;;;;;
FF0C;FULLWIDTH COMMA;Po;0;CS;<wide> 002C;;;;N;;;;;
FF0E;FULLWIDTH FULL STOP;Po;0;CS;<wide> 002E;;;;N;FULLWIDTH PERIOD;;;;
FF0F;FULLWIDTH SOLIDUS;Po;0;CS;<wide> 002F;;;;N;FULLWIDTH SLASH;;;;
FF1A;FULLWIDTH COLON;Po;0;CS;<wide> 003A;;;;N;;;;;
At the moment, I can't reconstruct what we were thinking when we decided
to allow it.
The parenthetical remark can go without great loss; nothing depends on
it. But I wonder why we are allowing CS at all.
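
The CS listing above can be cross-checked against whatever Unicode data a local Python build ships. The following is only a minimal sketch, not part of the draft: the helper name `cs_characters` is made up for illustration, and the result depends on the Unicode version behind the `unicodedata` module. With current Unicode data the output should also include the ASCII comma, full stop, solidus and colon, which matters for the "." question raised in the review.

```python
import unicodedata


def cs_characters(limit=0x110000):
    """Return (code point, name) pairs whose Bidi_Class is CS.

    Hypothetical helper for illustration; results follow the Unicode
    version bundled with the running Python interpreter.
    """
    found = []
    for cp in range(limit):
        ch = chr(cp)
        # unicodedata.bidirectional() returns the Bidi_Class as a string,
        # e.g. "L", "R", "AL", "CS", or "" for unassigned code points.
        if unicodedata.bidirectional(ch) == "CS":
            found.append((cp, unicodedata.name(ch, "<unnamed>")))
    return found


if __name__ == "__main__":
    # Print in a UnicodeData-like "code;name" form for easy comparison.
    for cp, name in cs_characters():
        print(f"{cp:04X};{name}")
```

Comparing this output with the code points that "Tables" actually permits would show which CS characters, if any, can really appear in a label.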
> John C Klensin wrote:
>> Speaking for myself only...
>> It may or may not have been wise for the Gen-ART group to parcel
>> these documents out to different reviewers. On the one hand, it
>> reduces the burden but, on the other, they are closely related
>> and, without seeing the relationship, confusion is not only
>> possible but likely. Realistically, no one should ever
>> implement "bidi" without implementing all of IDNA2008; to try to
>> do so will lead to all sorts of "interesting" problems.
>> You have spotted one of those. When the bidi document says that
>> characters from particular categories are allowed, that is not
>> an assertion that _all_ code points falling into any of those
>> categories are allowed, only that the categories are allowed
>> (and need to be treated in bidi-specific ways). Which
>> characters are actually allowed is specified by application of
>> the rules in "Tables" as invoked and selected from "Protocol".
>> There was extensive discussion in the WG about how, and whether,
>> to split the documents up the way they are divided and
>> reasonably strong (I personally believe stronger than just
>> "rough") consensus that the current arrangement represents the
>> best balance we could arrive upon.
>> If you can suggest ways to make the relationships and, in
>> particular, the relationship of the categories used in Bidi and
>> the character and code-point selections and restrictions in
>> Tables and Protocol, more clear, I assume that the document
>> authors and WG would welcome those suggestions.
>> --On Monday, October 05, 2009 10:43 -0400 "Joel M. Halpern"
>> <jmh at joelhalpern.com> wrote:
>>> I have been selected as the General Area Review Team (Gen-ART)
>>> reviewer for this draft (for background on Gen-ART, please see
>>> http://www.alvestrand.no/ietf/gen/art/gen-art-FAQ.html ).
>>> Please resolve these comments along with any other Last Call
>>> comments you may receive.
>>> Document: draft-ietf-idnabis-bidi-06.txt
>>> Right-to-left scripts for IDNA
>>> Reviewer: Joel M. Halpern
>>> Review Date: 5-Oct-2009
>>> IETF LC End Date: 14-Oct-2009
>>> IESG Telechat date: N/A
>>> Summary: This document is nearly ready for publication as a
>>> proposed standard.
>>> There is one comment I have marked as Major. I presume that
>>> the actual problem is not a defect in the intent of the spec,
>>> but a defect in this reader's understanding. Presuming such,
>>> I would ask that clarifying text be added.
>>> Major issues:
>>> In section 2, when describing the rules for what is allowed in
>>> labels, CS is allowed in labels. It is not allowed to start
>>> RTL labels. This looks fine, until I realized that CS
>>> includes ".", which I am pretty sure is not allowed in a label.
>>> This gets further complicated in section 3, when talking about
>>> "The Character Grouping requirement", the text talks about
>>> "Delimiterchars" being CS, WS, or ON. A parenthetical then
>>> says "They are not allowed in domain labels."
>>> Since the normative text said that CS is allowed, there seems
>>> to be a problem.