[lsb@lsb.org: [EAI] (summary) display of RightToLeft chars in
localparts and hostnames]
Harald Alvestrand
harald at alvestrand.no
Thu Dec 7 10:12:31 CET 2006
Soobok Lee wrote:
> On Thu, Dec 07, 2006 at 09:36:22AM +0100, Harald Alvestrand wrote:
>
>>> 200E; LEFT-TO-RIGHT MARK
>>> 200F; RIGHT-TO-LEFT MARK
>>>
>>> My suggestion for new stringprep200x is to move these chars
>>> to "mapped to nothing lists". that is, how about deleting silently
>>> them instead of prohibiting them and returning error ?
>>>
>> Any string that contains them will (one assumes) depend on their correct
>> interpretation for correct display.
>>
>> Mapping them out and letting people use the resulting string powerfully
>> violates the principle of least astonishment; if I, for reasons of my own,
>> choose to send in the string (in network order) <RLO> D N A R T S E V L A
>> <RLO>, expecting to see the display ALVESTRAND, I will be astonished if the
>> result is DNARTSEVLA.
>>
>> I'll be even more surprised if someone is able to register
>> <RLO>DNARTSEVLA<LRO>.com and use that in a phishing attack on
>> alvestrand.com - returning an error message is IMHO Exactly The Right Thing
>> To Do.
>>
>
> Thanks for your correction. Just deleting is NOT the right answer. My thought
> was somewhat short about that. :0
>
> My new suggestion is that: stringprep processes
> <RLE>D N A R T S E V L A<PDF> ==> ALVESTRAND
> <LRE>YOD HE WOW HE<PDF> ==> HE WOW HE YOD ( in Hebrew)
> instead of just deleting or prohibiting <RLE> and <LRE>.
>
> How do you think about this "Just delete with reordering"?
> It won't complicate stringprep algorithms so much.
I suppose it's possible to execute the whole bidi algorithm of UAX#9 and
re-code the result as some kind of "normalized RTL". Is there a
normalization algorithm for bidi in Unicode?

But I don't see that it's reasonable to expect EVERY IDNA implementation
to do this - complexity is WAY higher than for many other things.
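
To give a feel for the complexity gap, here is a minimal Python sketch that
rewrites only the most trivial case, a single non-nested <RLO> ... <PDF> run,
into display order. The function name resolve_simple_rlo is hypothetical and
nothing here is the UAX#9 algorithm: embeddings (<RLE>/<LRE>), nesting,
neutral and weak characters, mirroring, and mixed-direction text are all left
out, and those are exactly the parts that make a full implementation expensive.

```python
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# Other directional controls this toy refuses to handle.
UNSUPPORTED = frozenset("\u200e\u200f\u202a\u202b\u202d")


def resolve_simple_rlo(s: str) -> str:
    """Rewrite one non-nested RLO ... PDF run into display order.

    Toy illustration only (an assumption of this sketch, not UAX#9):
    the overridden run is simply reversed and the controls are dropped.
    """
    if any(c in UNSUPPORTED for c in s) or s.count(RLO) > 1:
        raise ValueError("input is beyond what this toy sketch handles")

    start = s.find(RLO)
    if start == -1:
        if PDF in s:
            raise ValueError("PDF without a matching RLO")
        return s

    end = s.find(PDF, start + 1)
    if end == -1:
        # An unterminated override runs to the end of the string.
        inner, tail = s[start + 1:], ""
    else:
        inner, tail = s[start + 1:end], s[end + 1:]

    if PDF in tail:
        raise ValueError("PDF without a matching RLO")

    # Under an override every character is laid out right to left,
    # so the display order is the reverse of the stored order.
    return s[:start] + inner[::-1] + tail
```

Even this toy has to reject most inputs outright, which is the point: going
from here to something every implementation could be expected to run is a
large step.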
If we make a clear separation between "allowed characters on the wire"
and "advice to implementors on how they can help people recover from
weird-encoding errors", this may go into the latter part.
Harald
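
The separation suggested above, between what is allowed on the wire and what
an implementation may do to help users recover, could look roughly like the
Python sketch below. The names BIDI_CONTROLS, check_wire_label and
suggest_recovery are hypothetical, and the character set is an assumption
limited to the marks, embeddings and overrides named in this thread. It is
not a stringprep profile.

```python
# Directional marks, embeddings and overrides discussed in this thread.
BIDI_CONTROLS = frozenset(
    "\u200e"  # LEFT-TO-RIGHT MARK
    "\u200f"  # RIGHT-TO-LEFT MARK
    "\u202a"  # LEFT-TO-RIGHT EMBEDDING (LRE)
    "\u202b"  # RIGHT-TO-LEFT EMBEDDING (RLE)
    "\u202c"  # POP DIRECTIONAL FORMATTING (PDF)
    "\u202d"  # LEFT-TO-RIGHT OVERRIDE (LRO)
    "\u202e"  # RIGHT-TO-LEFT OVERRIDE (RLO)
)


def check_wire_label(label: str) -> str:
    """Strict rule: reject any label that contains a bidi control."""
    bad = sorted({f"U+{ord(c):04X}" for c in label if c in BIDI_CONTROLS})
    if bad:
        raise ValueError("prohibited bidi control(s): " + ", ".join(bad))
    return label


def suggest_recovery(label: str) -> str:
    """Advisory only: a candidate string with the controls removed.

    A user interface could offer this after check_wire_label fails.
    It must not be substituted silently, because the stored order may
    differ from what the user saw on screen.
    """
    return "".join(c for c in label if c not in BIDI_CONTROLS)
```

With the label <RLO>DNARTSEVLA<PDF>, check_wire_label raises an error, while
suggest_recovery returns DNARTSEVLA, which is the stored order and not the
ALVESTRAND the sender saw. That mismatch is why the second function is kept
as a suggestion rather than a mapping step.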