Lookup CONTEXTJ test
Simon Josefsson
simon at josefsson.org
Sun Jan 9 16:22:37 CET 2011
Vint Cerf <vint at google.com> writes:
> simon:
>
> from RFC5894
Thank you, that was the explanation I was looking for.
I note that RFC5894 is neither a normative reference from IDNA2008 nor a
standards-track document, so if we want the explanation in that section
to have bearing on implementations, that should be fixed.
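To check my understanding, here is a minimal sketch of the lookup test as
clarified in this thread: a label is rejected only if it contains a CONTEXTJ
code point whose rule fails, and a null rule counts as a failure. The names
passes_contextj, zwj_rule and EXAMPLE_RULES are hypothetical, the ZWJ rule is
only modeled on my reading of the Tables document, and ZWNJ is given a null
rule purely to exercise that branch (RFC5892 does define a rule for it).

```python
import unicodedata
from typing import Callable, Dict, Optional

# A contextual rule receives the label and the index of the code point
# under test, and returns True if the context is acceptable.
Rule = Callable[[str, int], bool]

ZWJ = "\u200D"   # ZERO WIDTH JOINER
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
VIRAMA_CCC = 9   # Canonical_Combining_Class value for Virama


def zwj_rule(label: str, i: int) -> bool:
    """Assumed shape of the ZWJ rule: the preceding code point is a virama."""
    return i > 0 and unicodedata.combining(label[i - 1]) == VIRAMA_CCC


def passes_contextj(label: str, rules: Dict[str, Optional[Rule]]) -> bool:
    """Return False if any CONTEXTJ code point fails its rule.

    `rules` maps each CONTEXTJ code point to its rule, or to None when the
    rule is null. A null rule is treated as having failed.
    """
    for i, cp in enumerate(label):
        if cp not in rules:
            continue  # not CONTEXTJ; other lookup checks are out of scope here
        rule = rules[cp]
        if rule is None:
            return False  # null rule: implicit failure
        if not rule(label, i):
            return False  # interpretation 2: CONTEXTJ AND the rule fails
    return True


# Illustrative table only; the null entry for ZWNJ is an assumption made
# to demonstrate the null-rule case, not real registry data.
EXAMPLE_RULES: Dict[str, Optional[Rule]] = {ZWJ: zwj_rule, ZWNJ: None}
```

With this table, a label without joiners passes, a ZWJ directly after a
Devanagari virama passes, and a ZWJ after an ordinary letter is rejected.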
/Simon
> 3.1.2.2. Rules and Their Application
>
> Rules have descriptions such as "Must follow a character from Script
> XYZ", "Must occur only if the entire label is in Script ABC", or
> "Must occur only if the previous and subsequent characters have the
> DFG property". The actual rules may be DEFINED or NULL. If present,
> they may have values of "True" (character may be used in any position
> in any label), "False" (character may not be used in any label), or
> may be a set of procedural rules that specify the context in which
> the character is permitted.
>
> Because it is easier to identify these characters than to know that
> they are actually needed in IDNs or how to establish exactly the
> right rules for each one, a rule may have a null value in a given
> version of the tables. Characters associated with null rules are not
> permitted to appear in putative labels for either registration or
> lookup. Of course, a later version of the tables might contain a
> non-null rule.
>
> The actual rules and their descriptions are in Sections 2 and 3 of
> the Tables document [RFC5892]. That document also specifies the
> creation of a registry for future rules.
>
>
>
> On Sun, Jan 9, 2011 at 5:30 AM, Simon Josefsson <simon at josefsson.org> wrote:
>
>> Vint Cerf <vint at google.com> writes:
>>
>> > simon:
>> >
>> > Suppose you are (well, your software is) examining a string to determine
>> > whether to look it up in the DNS and, on examination, you discover a
>> > character in an apparent U-label that is labeled "CONTEXTJ". CONTEXTJ is one
>> > of the several special handling rules in IDNA2008. Upon examination, you
>> > discover that the character does NOT conform to the rule associated with
>> > CONTEXTJ. At this point you should cease further examination and reject the
>> > string as not being acceptable even for lookup in the DNS. If the character
>> > satisfies the associated CONTEXTJ rule, you may continue to examine the
>> > string prior to looking it up.
>> >
>> > Your interpretation (2) is the correct one. The idea is to allow the use of
>> > "joiner" characters only under specific conditions.
>>
>> Thank you, this is clear to me now.
>>
>> > A "null" rule is a condition that has no specific actions associated with
>> > it. It's like defining a class of characters (perhaps by their Unicode
>> > properties) for purposes of singling them out for special treatment, and
>> > then not saying what should be done about them. If there is no rule, and if
>> > a character in a string under examination meets the condition, the string
>> > must be rejected if the condition does not have a defined rule (action)
>> > associated with it. The lack of a rule means there is no test to perform and
>> > it is interpreted in IDNA2008 as having failed implicitly. "null rule" means
>> > a rule that is "empty", "has no content", "missing", "awol".
>>
>> When would this situation occur?
>>
>> /Simon
>>
>> >
>> > v
>> >
>> >
>> >
>> >
>> > On Sat, Jan 8, 2011 at 5:07 AM, Simon Josefsson <simon at josefsson.org> wrote:
>> >
>> >> Hi,
>> >>
>> >> I need help with interpretation regarding section 5.4 which says:
>> >>
>> >> Putative U-labels with any of the following characteristics MUST be
>> >> rejected prior to DNS lookup:
>> >> ...
>> >> o Labels containing code points that are identified in the Tables
>> >> document as "CONTEXTJ", i.e., requiring exceptional contextual
>> >> rule processing on lookup, but that do not conform to those rules.
>> >>
>> >> I have trouble understanding the bullet text. To me, it seems as if the
>> >> first part of the sentence, namely:
>> >>
>> >> Labels containing code points that are identified in the Tables
>> >> document as "CONTEXTJ"
>> >>
>> >> says one thing but the rest of the sentence, namely:
>> >>
>> >> requiring exceptional contextual rule processing on lookup, but
>> >> that do not conform to those rules.
>> >>
>> >> says a different thing.
>> >>
>> >> What is not clear to me is whether the test on a particular label is
>> >> intended to fail if and only if:
>> >>
>> >> 1) the label has any code point with the CONTEXTJ property.
>> >>
>> >> 2) the label has any code point with the CONTEXTJ property AND the rule
>> >> fails.
>> >>
>> >> Interpretation 2) makes the most sense to me, but the normative part of
>> >> the sentence suggests otherwise so I am looking for clarification.
>> >>
>> >> The text goes on and says:
>> >>
>> >> Note that this implies that a rule must be defined, not null: a
>> >> character that requires a contextual rule but for which the rule
>> >> is null is treated in this step as having failed to conform to the
>> >> rule.
>> >>
>> >> What is a "null rule"? I cannot find any definition.
>> >>
>> >> /Simon
>> >> _______________________________________________
>> >> Idna-update mailing list
>> >> Idna-update at alvestrand.no
>> >> http://www.alvestrand.no/mailman/listinfo/idna-update
>> >>
>>