The Two Lookups Approach (was Re: Parsing the issues and finding a middle ground -- another attempt)

Sun Mar 8 04:44:20 CET 2009

Hi John,

In general, I think I agree with you that it would be confusing to
have names for each of the nine subcategories. However, in our
discussions, some of us have come up with names for certain commonly
discussed things (Martin's L-label for locally mapped labels, my
V-label for variant (globally mapped) labels and Mark's C-label, as
you say). I think we would need more consensus before any of this is
adopted. G-label might be better than V-label, if it is to mean
globally mapped label.
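
To make the counting concrete, here is a rough sketch of the nine subcategories from John's two lists (quoted below) written out as plain enumerations. The identifiers (USide, ASide, strictly_named and the member names) are hypothetical and appear in none of the drafts; they only show that the unqualified terms "U-label" and "A-label" are strictly correct for just two of the nine stages.

```python
from enum import Enum


class USide(Enum):
    """Hypothetical names for the five stages on the non-ASCII side."""
    UNEXAMINED = 1          # in an IDNA-aware slot, nobody has looked at it yet
    NON_ASCII_UNTESTED = 2  # contains non-ASCII, no U-label tests applied
    PARTLY_TESTED = 3       # some tests passed, not enough for Punycode conversion and lookup
    LOOKUP_VALID = 4        # passes the lookup tests, but not CONTEXTO/Bidi
    U_LABEL = 5             # passes everything required of a U-label


class ASide(Enum):
    """Hypothetical names for the four stages on the "xn--" side."""
    XN_PREFIX_UNTESTED = 1  # starts with "xn--" (case independent), nothing else checked
    PARTLY_TESTED = 2       # some tests passed, not enough to look up in the DNS
    LOOKUP_VALID = 3        # valid for DNS lookup, A-label criteria not fully checked
    A_LABEL = 4             # passes everything required of an A-label


def strictly_named(stage):
    """True only where the unqualified term "U-label" or "A-label" applies."""
    return stage in (USide.U_LABEL, ASide.A_LABEL)
```

Seven of the nine stages would need either a new name or a descriptive phrase, which is the trade-off John describes.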

Another aspect of this, which I am not sure you've captured in your
drafts, is the Unicode version in use in a particular U-label. For
example, in a protocol involving U-labels, it might be good to specify
what to do when a sender sends a U-label containing characters from a
newer version of Unicode than the receiver has implemented.
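
As a minimal sketch of the receiver's side of that problem, assume the receiver is a Python program whose Unicode tables are those of the standard unicodedata module. It can at least detect characters that are unassigned (general category Cn) in its own Unicode version. The function name unassigned_code_points is made up for illustration, and the sketch only detects the situation; what a protocol should then do is the open question.

```python
import unicodedata


def unassigned_code_points(label):
    """Return the characters of `label` that this receiver's Unicode
    database treats as unassigned (general category "Cn").

    Caveat: the receiver cannot tell a character assigned in a newer
    Unicode version from one that has never been assigned at all.
    """
    return [ch for ch in label if unicodedata.category(ch) == "Cn"]


# The Unicode version this receiver implements, e.g. for logging or
# for reporting back to the sender.
receiver_version = unicodedata.unidata_version
```

A receiver could call unassigned_code_points(label) before any further processing and treat a non-empty result as "cannot validate with my Unicode version" rather than as "invalid".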

Erik

On Sat, Mar 7, 2009 at 12:30 PM, John C Klensin <klensin at jck.com> wrote:
> One caveat about this, independent of most of the current
> thread.  I tried to reflect the model for discussion in Appendix
> 1 of the recently-posted Protocol-10.   I wrote too hastily and
> while too tired and botched the definition.   I suggest either
> reading the text for general principles rather than the specific
> mechanism or waiting for the revised version (which will appear
> before the cutoff).
>
> An almost-similar comment applies to the workaround for using
> "A-label" and "U-label" on lookup.  The lookup checks don't
> ensure that all of the criteria for (registration) A-labels and
> U-labels are met, so the unqualified use of those terms on
> lookup is not strictly correct (as Mark has pointed out several
> times).  The original approach to that problem (many drafts ago
> and before the problem existed, much less was pointed out) was
> to describe labels that had not satisfied all tests as "putative
> A-labels" and/or "putative U-labels" -- strings that looked more
> or less like those label forms and that were claimed, by context
> and form, to be them, but that had not yet passed all of the
> relevant tests.   Some participants in the WG objected strongly
> to "putative", so I eliminated it in several contexts, resulting
> in (or reinforcing) the definitional problem in Lookup.  In
> Protocol-10, I tried using "apparent U-label".  That doesn't
> quite work either, so -11 will use a different approach,
> eliminating that terminology entirely from the lookup side (but
> resulting in slightly more convoluted text).
>
> Mark suggested a different model, which was to introduce
> "C-label" as a term for the superset of A-labels that met the
> lookup criteria and restrictions but not necessarily all of the
> A-label criteria. I liked that idea (due to an editing error,
> Protocol-10 contains a vestige of my trying to fit it in).  But,
> in the process of working on the text, I realized that we
> actually have many categories here and that a terminology
> solution would require introducing terms for more than just one
> more of them.  In particular, there appear to be:
>
>        * strings in IDNA-aware slots that no one has looked at
>        yet.
>        * strings in such slots that contain non-ASCII
>        characters but that have not yet been subjected to any
>        of the validation tests for U-labels.
>        * strings of that variety that have passed some of
>        the validation tests for U-labels, but not even enough
>        to be valid for Punycode conversion and lookup.
>        * strings of that variety that have passed all of the
>        validation tests needed for Punycode conversion and
>        lookup, but not the additional tests (CONTEXTO, Bidi)
>        required for U-labels.
>        * U-labels
>
> and, similarly,
>
>        * strings in IDNA-aware slots that start in "xn--" (case
>        independent) but have not otherwise been subjected to
>        any of the validation tests for A-labels.
>        * strings of that variety that have passed some of the
>        validation tests for A-labels, but not enough of
>        them to determine that they are valid for looking up in
>        the DNS.
>        * strings that have been determined to be valid for
>        looking up in the DNS but that have not been checked for
>        the additional criteria needed to qualify them as
>        A-labels.
>        * A-labels
>
> While I'm happy to change things if the WG prefers (and comes up
> with appropriate terms and definitions), my editorial judgment
> is that trying to solve the problem of partially-checked strings
> by adding terminology to distinguish among the nine
> subcategories above is more likely to confuse the reader than to
> help with understanding or implementations (even though it would
> arguably improve precision).
>
> As suggested above, Protocol-10 tried to address the issue by
> talking about "apparent" U-labels.  Only after I was about ready
> to post it did I realize that this returned us to "putative" in
> slightly different clothing (and that is noted in the draft).
> Protocol-11 eliminates the problematic text by going back to the
> convoluted description form that Patrik criticized earlier in
> the context of the U-label terminology.  I don't know if that is
> the best long-term solution, but, if it is not, I need help from
> the WG in figuring out a better one.
>
>     john