Document:  draft-ietf-ltru-matching-15.txt
Reviewer: Elwyn Davies [elwynd@dial.pipex.com]
Review Date:  Thursday 6/29/2006 9:20 AM CST
IESG Telechat Date:  Thursday, 6 July 2006

Summary: This document is almost ready for BCP.  I have one issue with it (noted 
below) which IMO needs to be considered. There are also two 'bugs' in 
the specification of character codes and a number of suggested editorial 
changes which could be passed to the RFC Editor.

Bugs:
Assuming we are supposed to be using RFC4234 conventions:
s2, para 3: s/%2A/%x2A/
s3.3.2, para 2 (Item 1.): s/%2D/%x2D/

Comment/Issue:
I find the whole use of wildcards in extended language ranges 
counter-intuitive.  The idea that de-de and de-*-de provide the same 
matches but the * in *-de is meaningful makes my brain hurt, although I 
can see some of the reasoning which has led to this unpleasant result.
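
The behaviour described above can be illustrated with a minimal Python 
sketch of one reading of the s3.3.2 extended filtering steps.  The 
function name extended_filter is hypothetical, and the rule that a 
single-character subtag stops the search is an assumption about the 
draft's intent rather than quoted text.

```python
def extended_filter(lang_range: str, tag: str) -> bool:
    """Hypothetical helper: True if `tag` matches the extended `lang_range`."""
    # Subtags are compared case-insensitively.
    range_subtags = lang_range.lower().split("-")
    tag_subtags = tag.lower().split("-")

    # The first subtags must match; a leading "*" matches any first subtag.
    if range_subtags[0] != "*" and range_subtags[0] != tag_subtags[0]:
        return False

    r, t = 1, 1
    while r < len(range_subtags):
        if range_subtags[r] == "*":
            # A non-initial wildcard consumes nothing by itself.
            r += 1
        elif t >= len(tag_subtags):
            # The range still requires a subtag but the tag has run out.
            return False
        elif range_subtags[r] == tag_subtags[t]:
            r += 1
            t += 1
        elif len(tag_subtags[t]) == 1:
            # Assumption: never skip over a singleton such as "x".
            return False
        else:
            # Skip a non-matching tag subtag and keep looking.
            t += 1
    return True


if __name__ == "__main__":
    for rng in ("de-de", "de-*-de", "*-de"):
        for tag in ("de-DE", "de-Latn-DE", "fr-DE", "de"):
            print(rng, tag, extended_filter(rng, tag))
```

Under these assumptions de-de and de-*-de both accept de-DE and 
de-Latn-DE and both reject fr-DE, whereas *-de also accepts fr-DE: 
only the wildcard in the first position changes the result.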

Be that as it may, if I understand the intention of this document 
correctly, the syntactical specification of extended language ranges in 
s2.2 is intended to allow for a number of matching algorithms, not 
necessarily defined in the draft.  It therefore seems inappropriate to 
partially specify the semantics in s2.2 as they are used in the extended 
filtering algorithm defined in s3.3.2 when some other matching algorithm 
might apply different semantics. Any semantics that are specified in 
s2.2 should be those that are intended to apply for any matching 
algorithm - there may not be any!

My initial response to s2.2 before reading s3.3.2 was as follows:
s2.2: Sequences of wildcards and empty subtag sequences:  This section 
probably needs to be more explicit in a number of ways:
a) Does 'any sequence of subtags' include 'an empty sequence of subtags'?
b) The ABNF allows forms such as aa-*-*-zz:  Is this intended?  If so, is 
it equivalent to aa-*-zz or not?
c) As a particular case of (b), is *-*-zz equivalent to *-zz, or is the 
wildcard on the primary language tag special?
d) Is it necessary to add a trailing wildcard to soak up subtags after 
the last match (this might become obvious in the matching algorithms but 
a comment here might help)?

To summarize, I suggest:
- Removing the last paragraph of s2.2 and replacing it with a summary of 
any general semantics expected to apply to an Extended Language Range 
(e.g., any sequence of wildcards is equivalent to one wildcard, final 
wildcards can be omitted, and a wildcard matches any sequence of 
subtags, whether empty or non-empty).
- Reordering s3.3.2 to describe the intention and give the examples 
before specifying the algorithm - this would considerably aid 
understanding IMO.  Maybe also clarify that -*-* is equivalent to -*, etc.
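
As an illustration of the general semantics suggested in the first 
item, here is a minimal Python sketch that reduces an extended 
language range to a canonical form.  The function name normalize_range 
is hypothetical, and the two rules it applies (runs of wildcards 
collapse to one, trailing wildcards are dropped) are the suggestions 
made in this review, not text from the draft.

```python
def normalize_range(lang_range: str) -> str:
    """Hypothetical helper: canonical form under the suggested semantics."""
    collapsed = []
    for subtag in lang_range.split("-"):
        # Rule 1 (suggested): a run of wildcards is equivalent to one wildcard.
        if subtag == "*" and collapsed and collapsed[-1] == "*":
            continue
        collapsed.append(subtag)

    # Rule 2 (suggested): final wildcards can be omitted.  At least one
    # subtag is always kept so that a bare "*" stays a valid range.
    while len(collapsed) > 1 and collapsed[-1] == "*":
        collapsed.pop()

    return "-".join(collapsed)


if __name__ == "__main__":
    for rng in ("aa-*-*-zz", "*-*-zz", "de-*", "*"):
        print(rng, "->", normalize_range(rng))
```

If these semantics were adopted, two ranges would be equivalent 
whenever they normalize to the same string, which would answer 
questions (b) to (d) above directly.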

Editorial:
s1, para 4: Presentation might be clearer as a bulleted list.
s1, para 5: s/-14/(actual version number)/

s2, para 1: Make it clearer that '(section 14.4)' applies to RFC2616 
and not the current document.
s2, para 3: A cross-reference to Section 2.1 of [RFC3066bis] where the 
syntax of language tags is defined would be helpful. 
s2, para 3: For consistency, "hyphen" should also be given as a hex 
value, as in [RFC3066bis]: hyphen ("-", ABNF [RFC4234] %x2D).

s2.1, para 1: The phrase 'has the same syntax as an [RFC3066] language 
tag' is undesirable since this specification obsoletes RFC3066.  
Something like 'has either a syntax which reflects the overall structure 
of [RFC3066bis] language tags or ...' would be better.
s2.1, para 3: s/Such ill-formed ranges/Any ranges that are not well-formed/

s2.3, para 1: Transliteration/Internationalization issue: Many of the 
Sámi-using peoples transliterate Sámi as Saami rather than Sami.