Reminder: BIDI inter-label tests in -02
John C Klensin
klensin at jck.com
Sat Sep 6 00:44:30 CEST 2008
--On Friday, 05 September, 2008 22:54 +0200 Harald Alvestrand
<harald at alvestrand.no> wrote:
>...
>> Harald,
>>
>> FWIW, let me describe where I think we emerged from Dublin on
>> this (some of this falls into the category of what I think of
>> as corollaries to the discussions, not the meeting discussions
>> themselves).
>>
>> (1) Any requirement for inter-label checking is a
>> showstopper for DNS reasons and will remain a
>> showstopper regardless of anything this WG may or may
>> not wish to conclude. Put differently, including a
>> requirement for inter-label checking in a document is
>> just a way to ensure that the document will be shot down
>> by the DNS community during Last Call. Andrew or
>> others who raised the issue in Dublin might want to
>> clarify or affirm this, but my impression is that any
>> statement that uses 2119-normative language (other than
>> MAY) would constitute a requirement in that regard.
>>
> I would like those who hold this position to speak up. I don't
> understand that position, and would like to understand it
> (whether I agree with it or not) before giving up on this
> point.
See Andrew's note.
>> (2) URIs do not contain domain names in U-label form.
>> It is, at best, in poor stylistic taste for them to
>> contain non-ASCII characters in the domain field using
>> percent-encoding of U-labels rather than A-labels.
>> Because there are no manifest RtoL characters in URIs
>> (because there are no non-ASCII characters), there are
>> no RtoL-related URI display issues.
>>
> I was trying hard to not mention URIs in *this* message at
> all, because the question of IRIs vs URIs and
> what-occurs-where is, to my mind, both very knotty and deeply
> irrelevant to the question I am trying to ask, which is all
> about what the lookup process of a domain name with U-labels
> in it is permitted to do. So I'll ignore this issue on this
> particular thread.
Ok, but, because of the issue that follows, these aren't quite
separable.
>> (3) Some referral/indirection URIs constitute an
>> interesting challenge. Regardless of what current (and
>> draft) versions of the URI and IRI specs may say (or be
>> construed as saying), the domain-part of a URI is
>> clearly a "domain name slot" as that term is defined in
>> IDNA2003 (the definition in IDNA2008 is no different,
>> but I want to stress that this is not a new decision).
>> As such, it is expected to contain a U-label or A-label
>...
>> So one can have the "running text" problem, with or
>> without RtoL characters, even inside a URL and without
>> worrying about "paragraphs".
>>
>> If this is a problem for IRIs (and whether or not it is
>> is debatable), it is not a problem for this WG.
> Good point, and to my mind a very good example of why we
> should just focus on what happens to domain names when they
> are displayed as if they were plain text.
>> Now, while I have never been an advocate of positions like "we
>> can't address all of the cases and solve all of the problems,
>> therefore we should do nothing", I'm finding that this leads
>> me to a position close to Alireza's conclusion (if I
>> understand that conclusion correctly). However, I also see
>> zone policies and registration procedures as an important
>> part of the protocol. To me, that means removing all of the
>> normative language from the bulleted paragraph above and
>> replacing it with some lavish advice that points out the
>> nasty things that can happen when naive (or not-so-naive)
>> rendering engines display labels containing certain types of
>> characters in certain positions next to labels containing
>> certain other types of characters. I think that advice
>> should explain the cases, give examples, and (i) indicate
>> that administrators of zones that contain RtoL characters in
>> labels or that point into such zones (via CNAME, DNAME, and
>> maybe URI-containing NAPTR records) ought to be very careful
>> what they do and wish for lest massive user confusion and
>> astonishment occur and (ii) that applications software that
>> renders these strings in native-character form (certainly
>> including URI-to-IRI conversion and display programs) ought to
>> be very sensitive to these issues as well, perhaps contriving
>> to warn users that what they are seeing might not be what
>> they might expect to see.
>>
> I think the text you are asking for is already present in the
> document, but would like your suggestions for improving the
> text.
I will try to review it this weekend in the light of things I
think I understand now but clearly did not understand going into
the Dublin meeting.
> So I'll conclude that you think I should remove the bulleted
> point. Is that a correct interpretation?
Yes. See above.
>> Much as I'd like to do more, I don't see a path that would
>> permit us to do so.
> If your conclusion that the bullet I've called out, as it is
> now written, will cause the document to be blocked
> indefinitely is correct, I agree. I would, however, like the
> people who hold that position to explain it.
Again, see Andrew's note as a starting point. If we can make
those problems go away somehow, my position on this softens.
But, perhaps unlike Andrew, I do not believe that changing the
terminology in Protocol will help at all with the underlying
problem.
john
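
As an illustration of point (2) above (that a domain name carried in
A-label form contains no manifest RtoL characters), here is a minimal
sketch. It is not from the original message. It assumes Python's
built-in "idna" codec, which implements IDNA2003 (RFC 3490) rather than
IDNA2008; the helper names has_rtl and to_a_label and the sample Hebrew
label are hypothetical and chosen only for the example.

```python
import unicodedata

# Bidi classes for right-to-left letters (Hebrew "R", Arabic "AL").
RTL_CLASSES = {"R", "AL"}


def has_rtl(label: str) -> bool:
    """Return True if any character in the label is a right-to-left letter."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in label)


def to_a_label(u_label: str) -> str:
    """Convert one U-label to its ASCII A-label form.

    Assumption: uses the stdlib "idna" codec (IDNA2003), which is enough
    to show the point; IDNA2008 processing differs in its details.
    """
    return u_label.encode("idna").decode("ascii")


# Hypothetical sample label made up of four Hebrew letters.
u_label = "\u05e9\u05dc\u05d5\u05dd"
a_label = to_a_label(u_label)

print(has_rtl(u_label))             # the U-label contains RtoL characters
print(has_rtl(a_label))             # the A-label is plain ASCII
print(a_label.startswith("xn--"))   # A-labels carry the ACE prefix
```

The sketch only looks at a single label. It says nothing about how
adjacent labels interact when a renderer displays the U-label forms
side by side, which is the inter-label question under discussion.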