Lookup CONTEXTJ test

Simon Josefsson simon at josefsson.org
Sun Jan 9 16:22:37 CET 2011


Vint Cerf <vint at google.com> writes:

> simon:
>
> from RFC5894

Thank you, that was the explanation I was looking for.

I note that RFC5894 is not a normative reference from IDNA2008, nor a
standards track document, so if we want the explanation in that section
to have bearing on implementations, that should be fixed.

/Simon
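
For illustration, the lookup check discussed in this thread (interpretation
2, with a null rule treated as a failure) can be sketched as below. This is
a minimal sketch, not a reference implementation. The names
contextj_lookup_ok, CONTEXTJ_RULES and preceded_by_virama are hypothetical.
The rule table is simplified: it applies only the "preceded by a Virama"
condition to both joiners, and omits the joining-type alternative that
RFC5892 also gives for ZERO WIDTH NON-JOINER, so it is stricter than the
real rule for that character.

```python
import unicodedata

VIRAMA_CCC = 9  # canonical combining class shared by Virama characters


def preceded_by_virama(label, i):
    """Simplified contextual rule: the character before position i is a Virama."""
    return i > 0 and unicodedata.combining(label[i - 1]) == VIRAMA_CCC


# Hypothetical rule table: CONTEXTJ code point -> rule function.
# A value of None stands for a "null rule" (no rule defined).
CONTEXTJ_RULES = {
    "\u200d": preceded_by_virama,  # ZERO WIDTH JOINER
    "\u200c": preceded_by_virama,  # ZERO WIDTH NON-JOINER (simplified, see above)
}


def contextj_lookup_ok(label, rules=CONTEXTJ_RULES):
    """Return False if any CONTEXTJ character in label fails its rule."""
    for i, ch in enumerate(label):
        if ch not in rules:
            continue  # not a CONTEXTJ character; nothing to test here
        rule = rules[ch]
        if rule is None:
            return False  # null rule: treated as having failed to conform
        if not rule(label, i):
            return False  # rule is defined but the context does not satisfy it
    return True
```

A label without CONTEXTJ characters passes this step, a joiner directly
after a Virama passes, and a joiner anywhere else, or any character whose
rule is null, causes the label to be rejected before lookup.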

> 3.1.2.2.  Rules and Their Application
>
>    Rules have descriptions such as "Must follow a character from Script
>    XYZ", "Must occur only if the entire label is in Script ABC", or
>    "Must occur only if the previous and subsequent characters have the
>    DFG property".  The actual rules may be DEFINED or NULL.  If present,
>    they may have values of "True" (character may be used in any position
>    in any label), "False" (character may not be used in any label), or
>    may be a set of procedural rules that specify the context in which
>    the character is permitted.
>
>    Because it is easier to identify these characters than to know that
>    they are actually needed in IDNs or how to establish exactly the
>    right rules for each one, a rule may have a null value in a given
>    version of the tables.  Characters associated with null rules are not
>    permitted to appear in putative labels for either registration or
>    lookup.  Of course, a later version of the tables might contain a
>    non-null rule.
>
>    The actual rules and their descriptions are in Sections 2 and 3 of
>    the Tables document [RFC5892].  That document also specifies the
>    creation of a registry for future rules.
>
>
>
> On Sun, Jan 9, 2011 at 5:30 AM, Simon Josefsson <simon at josefsson.org> wrote:
>
>> Vint Cerf <vint at google.com> writes:
>>
>> > simon:
>> >
>> > Suppose you are (well, your software is) examining a string to determine
>> > whether to look it up in the DNS and, on examination, you discover a
>> > character in an apparent U-label that is labeled "CONTEXTJ". CONTEXTJ is one
>> > of the several special handling rules in IDNA2008. Upon examination, you
>> > discover that the character does NOT conform to the rule associated with
>> > CONTEXTJ. At this point you should cease further examination and reject the
>> > string as not being acceptable even for lookup in the DNS.  If the character
>> > satisfies the associated CONTEXTJ rule, you may continue to examine the
>> > string prior to looking it up.
>> >
>> > Your interpretation (2) is the correct one. The idea is to allow the use of
>> > "joiner" characters only under specific conditions.
>>
>> Thank you, this is clear to me now.
>>
>> > A "null" rule is a condition that has no specific actions associated with
>> > it. It's like defining a class of characters (perhaps by their Unicode
>> > properties) for purposes of singling them out for special treatment, and
>> > then not saying what should be done about them. If there is no rule, and if
>> > a character in a string under examination meets the condition, the string
>> > must be rejected if the condition does not have a defined rule (action)
>> > associated with it. The lack of a rule means there is no test to perform and
>> > it is interpreted in IDNA2008 as having failed implicitly. "null rule" means
>> > a rule that is "empty", "has no content", "missing", "awol".
>>
>> When would this situation occur?
>>
>> /Simon
>>
>> >
>> > v
>> >
>> >
>> >
>> >
>> > On Sat, Jan 8, 2011 at 5:07 AM, Simon Josefsson <simon at josefsson.org> wrote:
>> >
>> >> Hi,
>> >>
>> >> I need help with interpretation regarding section 5.4 which says:
>> >>
>> >>   Putative U-labels with any of the following characteristics MUST be
>> >>   rejected prior to DNS lookup:
>> >> ...
>> >>   o  Labels containing code points that are identified in the Tables
>> >>      document as "CONTEXTJ", i.e., requiring exceptional contextual
>> >>      rule processing on lookup, but that do not conform to those rules.
>> >>
>> >> I have trouble understanding the bullet text.  To me, it seems as if the
>> >> first part of the sentence, namely:
>> >>
>> >>    Labels containing code points that are identified in the Tables
>> >>    document as "CONTEXTJ"
>> >>
>> >> says one thing but the rest of the sentence, namely:
>> >>
>> >>      requiring exceptional contextual rule processing on lookup, but
>> >>      that do not conform to those rules.
>> >>
>> >> says a different thing.
>> >>
>> >> What is not clear to me is whether the test on a particular label is
>> >> intended to fail if and only if:
>> >>
>> >> 1) the label has any code point with the CONTEXTJ property.
>> >>
>> >> 2) the label has any code point with the CONTEXTJ property AND the rule
>> >>   fails.
>> >>
>> >> Interpretation 2) makes the most sense to me, but the normative part of
>> >> the sentence suggests otherwise so I am looking for clarification.
>> >>
>> >> The text goes on and says:
>> >>
>> >>      Note that this implies that a rule must be defined, not null: a
>> >>      character that requires a contextual rule but for which the rule
>> >>      is null is treated in this step as having failed to conform to the
>> >>      rule.
>> >>
>> >> What is a "null rule"?  I cannot find any definition.
>> >>
>> >> /Simon
>> >> _______________________________________________
>> >> Idna-update mailing list
>> >> Idna-update at alvestrand.no
>> >> http://www.alvestrand.no/mailman/listinfo/idna-update
>> >>
>>

