[lsb@lsb.org: [EAI] (summary) display of RightToLeft chars in localparts and hostnames]

Harald Alvestrand harald at alvestrand.no
Thu Dec 7 09:36:22 CET 2006



--On 7. desember 2006 13:01 +0900 Soobok Lee <lsb at lsb.org> wrote:

>
> I found this section in stringprep2003:
>
> <quote from section 5.8>
>  5.8 Change display properties or are deprecated
>
>    The following characters can cause changes in display or the order in
>    which characters appear when rendered, or are deprecated in Unicode.
>
>    200E; LEFT-TO-RIGHT MARK
>    200F; RIGHT-TO-LEFT MARK
>    202A; LEFT-TO-RIGHT EMBEDDING
>    202B; RIGHT-TO-LEFT EMBEDDING
>    202C; POP DIRECTIONAL FORMATTING
>    202D; LEFT-TO-RIGHT OVERRIDE
>    202E; RIGHT-TO-LEFT OVERRIDE
>    206A; INHIBIT SYMMETRIC SWAPPING
>    206B; ACTIVATE SYMMETRIC SWAPPING
>    206C; INHIBIT ARABIC FORM SHAPING
>    206D; ACTIVATE ARABIC FORM SHAPING
> </quote>
>
> My suggestion for the new stringprep200x is to move these chars
>   to the "mapped to nothing" list. That is, how about silently deleting
>   them instead of prohibiting them and returning an error?

Any string that contains them will (one assumes) depend on their correct 
interpretation for correct display.

Mapping them out and letting people use the resulting string powerfully 
violates the principle of least astonishment; if I, for reasons of my own, 
choose to send in the string (in network order) <RLO> D N A R T S E V L A 
<RLO>, expecting to see the display ALVESTRAND, I will be astonished if the 
result is DNARTSEVLA.

I'll be even more surprised if someone is able to register 
<RLO>DNARTSEVLA<LRO>.com and use that in a phishing attack on 
alvestrand.com - returning an error message is IMHO Exactly The Right Thing 
To Do.
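
The difference between the two policies can be made concrete with a minimal
Python sketch. The function names map_to_nothing and prohibit are
hypothetical, and the character set is only the subset listed in the quoted
table, so this is an illustration and not a stringprep implementation.

```python
# Characters from the quoted table (200E-200F, 202A-202E, 206A-206D).
BIDI_CONTROLS = frozenset(
    "\u200e\u200f\u202a\u202b\u202c\u202d\u202e\u206a\u206b\u206c\u206d"
)

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE


def map_to_nothing(label):
    """Proposed policy (hypothetical helper): silently delete the controls."""
    return "".join(ch for ch in label if ch not in BIDI_CONTROLS)


def prohibit(label):
    """Current policy (hypothetical helper): reject any label containing one."""
    for ch in label:
        if ch in BIDI_CONTROLS:
            raise ValueError(f"prohibited character U+{ord(ch):04X}")
    return label


# Network-order string from the example above, meant to display as ALVESTRAND.
submitted = RLO + "DNARTSEVLA" + RLO

print(map_to_nothing(submitted))  # DNARTSEVLA: the override is silently lost

try:
    prohibit(submitted)
except ValueError as err:
    print("rejected:", err)  # rejected: prohibited character U+202E
```

With deletion the sender gets back a different string than the one intended
for display; with prohibition the sender gets an error and can decide what to
do about it.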

> Reason:
>   As Harald's bidi draft explains, browser and email client
>  implementors will somehow have to settle on their own preferred
>  display order for IDN bidi labels and localparts, regardless of
>  whether or not the IETF recommends a specific display order.
>
>  For their purposes, above bidi functional chars would be used
>  to surround major IRI delimiters for display preparation.
>
>  When they are copied and pasted, those u+200e~u+206d may be
>  contained in the copy buffer, and then prohibited by stringprep2003,
>  but they would better be deleted by future stringprep200x.
>
> Soobok
>
>
> ----- Forwarded message from Soobok Lee <lsb at lsb.org> -----
>
> Date: Thu, 7 Dec 2006 11:04:25 +0900
> From: Soobok Lee <lsb at lsb.org>
> To: ima at ietf.org
>
> On Thu, Dec 07, 2006 at 09:59:07AM +0900, Soobok Lee wrote:
>> On Thu, Dec 07, 2006 at 09:48:17AM +0900, Soobok Lee wrote:
>> >
>> > http://www.ietf.org/internet-drafts/draft-alvestrand-idna-bidi-00.txt
>> > (page 6) and
>> > http://www.unicode.org/reports/tr36/#Bidirectional_Text_Spoofing
>> >
>> > If you read above references, you can understand why this:
>> > (storage order)
>> >   LocalRTL at FirstRTL.SecondRTL.com
>> >
>> > (old display order)
>> >   LTRdnoceS.LTRtsriF at LTRlacoL.com
>> >
>> > ( @ and dot are neutral chars wrt RtL and LtR direction)
>> >
>> > I remember that there had been some discussion about whether
>> >   we should do "RtoL stopper chars"(this may be not the right tech
>> >   term, sorry)  around  delimiters like @ or dot.
>> >
>> > This may not have a definitive right answer, but we had better
>> > choose one anyway.
>>
>> One answer of my own is: "for display preparation, insert an RtoL stopper
>> around all special chars in an IRI/URL".
>>
>> If we follow this:
>>
>>   (new display order)
>>    LTRlacoL at LTRtsriF.LTRdnoceS.com
>>
>>   here each  @ and dot have preceding unseen(transparent) RtoL stopper
>>   char.
>
> For RtoL stopper, we can use "LRE  dot PDF" sequence.
>
> But, we still have 3 problems:
>
> 1) The above choice would still leave the input-time display order looking
> like the (old display order), since we can't expect input method editors for
> BIDI chars to determine intelligently when to insert such surrounding
> stoppers in a running input entry form. So we should provide a consistent
> user experience across storage order, input-time order, old display
> order, and new display order. But it won't be a trivial task.
>
> 2) Moreover, when we copy and paste (new display order) hostname and
> localpart strings,  they may contain hidden LRE and PDF chars which have
> been *prohibited* in stringprep.   (http://www.ietf.org/rfc/rfc3454.txt
> section 5 and section 5.8)
>
> When IDN hostnames contain prohibited chars, stringprep will fail on them
> and return an error.
> To prevent this from happening, bidi LRE/PDF should not be copied by
> mouse operation.
>
> 3) Localparts may contain dots, for example
> [OSAMA].[BIN].[LADEN]@free.af. How to display dot-containing bidi
> localparts complicates this problem. I guess the localpart dot should
> be handled the same way as the hostname dot.
>
> I welcome any criticism/suggestion.
>
> Soobok
>
>>
>> Soobok
>>
>> _______________________________________________
>> IMA mailing list
>> IMA at ietf.org
>> https://www1.ietf.org/mailman/listinfo/ima
>
> ----- End forwarded message -----
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update
>
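
The copy-and-paste problem in point 2) of the quoted message can be sketched
as follows. This is a rough illustration under stated assumptions: the helper
names prepare_for_display and prohibited_c8, the sample address, and the
choice of "@" and "." as the delimiters to wrap are all hypothetical. The
check uses the standard-library stringprep module, whose in_table_c8 function
corresponds to RFC 3454 table C.8 (the table behind section 5.8).

```python
import stringprep  # standard library; exposes the RFC 3454 tables

LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING
DELIMITERS = "@."  # assumed set of "special chars" to wrap


def prepare_for_display(address):
    """Wrap each delimiter as LRE + delimiter + PDF (for display only)."""
    return "".join(
        LRE + ch + PDF if ch in DELIMITERS else ch for ch in address
    )


def prohibited_c8(text):
    """List the code points in text that table C.8 prohibits."""
    return [f"U+{ord(ch):04X}" for ch in text if stringprep.in_table_c8(ch)]


stored = "local@first.second.com"    # storage order, no control characters
shown = prepare_for_display(stored)  # what a copy buffer might pick up

print(prohibited_c8(stored))      # []
print(len(prohibited_c8(shown)))  # 6: one LRE and one PDF per delimiter
```

A string copied from the display-prepared form therefore carries characters
that stringprep2003 rejects, which is the situation the proposal is trying to
address and the reply above argues should remain an error.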





