Reminder: BIDI inter-label tests in -02

Sat Sep 6 00:44:30 CEST 2008

--On Friday, 05 September, 2008 22:54 +0200 Harald Alvestrand
<harald at alvestrand.no> wrote:

>...
>> Harald,
>> 
>> FWIW, let me describe where I think we emerged from Dublin on
>> this (some of this falls into the category of what I think of
>> as corollaries to the discussions, not the meeting discussions
>> themselves).
>> 
>> 	(1) Any requirement for inter-label checking is a
>> 	showstopper for DNS reasons and will remain a
>> 	showstopper regardless of anything this WG may or may
>> 	not wish to conclude.  Put differently, including a
>> 	requirement for inter-label checking in a document is
>> 	just a way to ensure that the document will be shot down
>> 	by the DNS community during Last Call.  Andrew or
>> 	others who raised the issue in Dublin might want to
>> 	clarify or affirm this, but my impression is any
>> 	statement that uses 2119-normative language (other than
>> 	MAY) would constitute a requirement in that regard.
>>   
> I would like those who hold this position to speak up. I don't
> understand that position, and would like to understand it
> (whether I agree with it or not) before giving up on this
> point.

See Andrew's note.

>> 	(2) URIs do not contain domain names in U-label form.
>> 	It is, at best, in poor stylistic taste for them to
>> 	contain non-ASCII characters in the domain field using
>> 	percent-encoding of U-labels rather than A-labels.
>> 	Because there are no manifest RtoL characters in URIs
>> 	(because there are no non-ASCII characters), there are
>> 	no RtoL-related URI display issues.  
>>   
> I was trying hard to not mention URIs in *this* message at
> all, because the question of IRIs vs URIs and
> what-occurs-where is, to my mind, both very knotty and deeply
> irrelevant to the question I am trying to ask, which is all
> about what the lookup process of a domain name with U-labels
> in it is permitted to do. So I'll ignore this issue on this
> particular thread.

Ok, but, because of the issue that follows, these aren't quite
separable.

>> 	(3) Some referral/indirection URIs constitute an
>> 	interesting challenge.  Regardless of what current (and
>> 	draft) versions of the URI and IRI specifications may say (or be
>> 	construed as saying), the domain-part of a URI is
>> 	clearly a "domain name slot" as that term is defined in
>> 	IDNA2003 (the definition in IDNA2008 is no different,
>> 	but I want to stress that this is not a new decision).
>> 	As such, it is expected to contain a U-label or A-label
>...
>> 	So one can have the "running text" problem, with or
>> 	without RtoL characters, even inside a URL and without
>> 	worrying about "paragraphs".
>> 	
>> 	If this is a problem for IRIs (and whether or not it is
>> 	is debatable), it is not a problem for this WG.

> Good point, and to my mind a very good example of why we
> should just focus on what happens to domain names when they
> are displayed as if they were plain text.

>> Now, while I have never been an advocate of positions like "we
>> can't address all of the cases and solve all of the problems,
>> therefore we should do nothing", I'm finding that this leads
>> me to a position close to Alireza's conclusion (if I
>> understand that conclusion correctly).  However, I also see
>> zone policies and registration procedures as an important
>> part of the protocol.  To me, that means removing all of the
>> normative language from the bulleted paragraph above and
>> replacing it with some lavish advice that points out the
>> nasty things that can happen when naive (or not-so-naive)
>> rendering engines display labels containing certain types of
>> characters in certain positions next to labels containing
>> certain other types of characters.  I think that advice
>> should explain the cases, give examples, and (i) indicate
>> that administrators of zones that contain RtoL characters in
>> labels or that point into such zones (via CNAME, DNAME, and
>> maybe URI-containing NAPTR records) ought to be very careful
>> what they do and wish for lest massive user confusion and
>> astonishment occur and (ii) that applications software that
>> renders these strings in native-character form (certainly
>> including URI -> IRI conversion and display programs) ought to
>> be very sensitive to these issues as well, perhaps contriving
>> to warn users that what they are seeing might not be what
>> they might expect to see.  
>>   
> I think the text you are asking for is already present in the
> document, but would like your suggestions for improving the
> text.

I will try to review it this weekend in the light of things I
think I understand now but clearly did not understand going into
the Dublin meeting.

> So I'll conclude that you think I should remove the bulleted
> point. Is that a correct interpretation?

Yes.  See above.

>> Much as I'd like to do more, I don't see a path that would
>> permit us to do so.
> If your conclusion that the bullet I've called out, as it is
> now written, will cause the document to be blocked
> indefinitely is correct, I agree. I would, however, like to
> have the people who hold that position explain their
> position.

Again, see Andrew's note as a starting point.  If we can make
those problems go away somehow, my position on this softens.
But, perhaps unlike Andrew, I do not believe that changing the
terminology in Protocol will help at all with the underlying
problem.

    john