Document: draft-newman-i18n-comparator-13.txt
Reviewer: Spencer Dawkins [spencer@mcsr-labs.org]
Review Date: Tuesday 8/15/2006 7:32 AM CST
IESG Telechat Date: Thursday, 17 August 2006

This is a re-review, my previous review was for 06, with Scott as shepherding AD, before IETF 65. I'm reading the deltas from 06 (in the spirit of not finding new problems with previously-reviewed text).

Summary: Again, nearly ready for publication as Proposed Standard, with some (new) items that do need to be addressed before publication.

Review Comments:

2.2. Purpose

Collations abstraction layer for comparison functions so that these comparison functions can be used in multiple protocols.

I am just barely able to parse this sentence so that it's not a sentence fragment. I think the problem is that "functions" is being used as a verb and as a noun in the same sentence. I saw later in the document that you had changed "function"-the-noun to "operation", so should be easy to fix. But this isn't an editorial comment, because I'm not sure what the sentence is saying.

4.2.2. Equality

...

In this specification, the return values of the equality test are called "match", "no-match" and "undefined". This is not a specification, merely a choice of phrasing.

What does the last sentence mean? (Brian Carpenter asked me, so he doesn't know, either).

5.2. Operations

...

Although the collation's substring function provides a list of matches, a protocol need not provide all that to the client. It may provide only the first matching substring, or even just the information that the substring search matched.

Hmmm. I am trying to remember that you're not defining a protocol, only describing what protocols do and don't do, but I'm trying to read this from the application's perspective, and having a hard time understanding how (for example) an application that is trying to display what is matching responds when the protocol only provides an indication that something matched.
You may say this is what the protocol developers are supposed to worry about ("if you think applications will want to display what matches, you'd better define the protocol so that this information is returned"), and that's OK. I'm just struggling a bit here.

6. Use by Existing Protocols

...

IMAP [16] also collates, although that is explicit only when the COMPARATOR [18] extension is used. The built-in IMAP substring operation and the ordering provided by the SORT [17] extension may not meet the requirements made in this document.

Other protocols may be in a similar position.

In IMAP, the default collation is i;ascii-casemap, because its operations most closely resembles IMAP's built-in operations.

EDITORIAL: I'm guessing that the previous paragraph should be moved up one? At the very least, I'm confused because I'm not sure if the top paragraph in this extract describes the differences between i;ascii-casemap and IMAP's built-in operations or is talking about something else.

9.1.1. ASCII Numeric Collation Description

The "i;ascii-numeric" collation is a simple collation intended for use with arbitrary sized unsigned decimal integer numbers stored as octet strings. US-ASCII digits (0x30 to 0x39) represent digits of the numbers. Before converting from string to integer, the input string is truncated at the first non-digit character. All input is valid; strings which do not start with a digit represent positive infinity.

Is it obvious to everyone except me that leading zeros are ignored? The examples given a little further down say so - is making this point in examples normative enough?

9.2.1. ASCII Casemap Collation Description

...

The i;ascii-casemap collation is well suited to to use with many internet protocols and computer languages.
Use with natural language is often inappropriate: even though the collation apparently supports languages such as Italian and English, in real-world use it tends to stumble over words such as "naive", names such as "Llwyd", people and place names containing non-ASCII, euro and pound sterling symbols, quotation marks, dashes/hyphens, etc.

OK, this may be inadvertently funny - are "naive" and "Llwyd" supposed to include a non-ASCII character, or is that sentence saying something else? (Welcome to the world of the RFC Editor)

13. Open Issues

...

adding a note to the RFC editor to possibly replace the 3066 reference

From Brian: Surely this needs to be done?

From Spencer: I'm thinking that the "checking the SP SP "1" SP SP string for correctness" also needs to be done pretty soon :-0
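
On the 9.1.1 leading-zeros question, a minimal Python sketch of one reading of the quoted description may make the point concrete. It is not taken from the draft: the names numeric_key and compare are hypothetical, and using a floating-point infinity for "positive infinity" (so that two non-numeric strings compare as equal) is an assumption. Under this reading, leading zeros drop out only as a side effect of the string-to-integer conversion, which is why it may be worth stating outright in the description.

```python
import math


def numeric_key(octets: bytes):
    """Sort key for an octet string under one reading of i;ascii-numeric."""
    # Truncate at the first octet that is not a US-ASCII digit (0x30 to 0x39).
    end = 0
    while end < len(octets) and 0x30 <= octets[end] <= 0x39:
        end += 1
    digits = octets[:end]
    # Strings that do not start with a digit represent positive infinity.
    if not digits:
        return math.inf
    # Integer conversion discards leading zeros: b"007" becomes 7.
    return int(digits.decode("ascii"))


def compare(a: bytes, b: bytes) -> int:
    """Return -1, 0 or 1 for less, equal or greater (hypothetical helper)."""
    ka, kb = numeric_key(a), numeric_key(b)
    # Assumption: two "positive infinity" inputs are treated as equal.
    return (ka > kb) - (ka < kb)
```

With this sketch, compare(b"007", b"7") reports the two strings as equal, and any string that does not start with a digit sorts after every number.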