NSM flaw?

"Martin J. Dürst" duerst at it.aoyama.ac.jp
Fri Sep 18 10:16:25 CEST 2009



On 2009/09/14 20:59, Abdulrahman I. ALGhadir wrote:
> Hey Martin,
>
> Well, as I said, if the protocol cares about how labels are displayed, then yes, this issue will be a problem.
> A sequence of the same NSM will be drawn as a single character since, as we know, an NSM is drawn at the same position as the preceding character, either above or below it.

As I said, this depends on the display engine. Please have a look at the 
attached gif image, which shows how your initial mail was rendered by my 
mail user agent (Thunderbird).

> Thus a single fatha (U+064E) and a sequence of fathas (U+064E, U+064E, etc.) will be rendered identically, because the engine draws them one over the other: same look, different code points.

On some engines, that may happen. What OS/programs are you using?
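
Whatever a given engine draws, the two labels discussed in this thread differ at the code point level, and a registry could detect the repetition mechanically. A minimal sketch, assuming Python's standard unicodedata module; the helper name repeated_nsm_positions is hypothetical and is not part of any IDNA draft:

```python
import unicodedata

def repeated_nsm_positions(label):
    """Return the indexes at which a nonspacing mark repeats the mark just before it."""
    positions = []
    for i in range(1, len(label)):
        ch = label[i]
        # Bidi class "NSM" is the property the BIDI rule refers to.
        if ch == label[i - 1] and unicodedata.bidirectional(ch) == "NSM":
            positions.append(i)
    return positions

# The two labels from the original mail: one fatha vs. two fathas.
single = "\u0633\u064A\u0627\u0631\u0629\u064E"
double = "\u0633\u064A\u0627\u0631\u0629\u064E\u064E"

print(single == double)                # False
print(repeated_nsm_positions(single))  # []
print(repeated_nsm_positions(double))  # [6]
```

This only flags an identical mark repeated back to back; which other mark combinations a registry should reject is a policy question that the sketch does not try to answer.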

> Well, what you saw, the circle with dots, is U+0629 (teh marbuta); the character above the dots is U+064E. Both words look the same but are encoded differently.

No, what I saw was just your mail. Nothing else. If you don't believe 
me, please try it out yourself (Windows Vista, Thunderbird Eudora 
version 3.01b (January 2009)).

Regards,    Martin.

> The same goes for other NSMs (fatha, damma, kasra, shadda, etc.).
>
> Thank you,
> Abdulrahman.
>
> -----Original Message-----
> From: "Martin J. Dürst" [mailto:duerst at it.aoyama.ac.jp]
> Sent: 14/Sep/2009 2:16 PM
> To: Abdulrahman I. ALGhadir
> Cc: idna-update at alvestrand.no; Arabic Scripts IDNA
> Subject: Re: NSM flaw?
>
> Hello Abdulrahman,
>
> Are you saying that there is a problem with two successive (identical)
> vowel marks (such as fatha, damma, kasra) because display engines will
> ignore the second one (because essentially, there is no point in
> indicating the vowel twice)?
>
> First, my mail agent (Thunderbird) displays both U+064E characters in
> your examples below (the second one above a circle that may be dotted,
> though I can't see the actual dots). But there may well be display
> engines that do what I think you say, so this may indeed be a problem.
>
> Second, while such NSM combinations (as well as much more far-fetched
> combinations of NSMs, or letters and NSMs, or letters and letters) are
> all allowed in the protocol, registries can (and in the case you point
> out most probably should) reject them. Because of the complexity of
> languages and scripts around the world, it wasn't possible to
> incorporate such restrictions (except for a few extremely crucial ones)
> into the protocol, but it would definitely be good if the cases you
> point out are documented by the group working on Arabic domain names. I
> have cc'ed the Arabic Scripts IDNA mailing list, maybe they are already
> aware of this and related issues.
>
> Regards,   Martin.
>
> On 2009/09/14 19:21, Abdulrahman I. ALGhadir wrote:
>> Hello,
>>
>> I am Abdulrahman I. Al-Ghadir from SaudiNIC. I am new to IDNA and joined the mailing list recently.
>>
>> While reviewing the latest drafts, I found something that may be a flaw in the protocol, in draft ftp://ftp.ietf.org/internet-drafts/draft-ietf-idnabis-bidi-05.txt:
>>
>>
>>
>> In bidi-05 “2.  The BIDI Rule”:
>>
>> “  2.  In an RTL label, only characters with the BIDI properties R, AL,
>>          AN, EN, ES, CS, ET, ON, BN and NSM are allowed.
>>
>>      3.  In an RTL label, the end of the label must be a character with
>>          BIDI property R, AL, EN or AN, followed by zero or more
>>          characters with BIDI property NSM.”
>>
>> A sequence of NSMs can thus appear in a label, which may cause a problem when the label is displayed.
>>
>>
>> Assume these two labels (image attached for both words):
>>
>>
>>
>> سيارةَ          ->             \u0633\u064A\u0627\u0631\u0629\u064E                     ->     xn--mgbexg9i1a
>>
>> سيارةََ          ->             \u0633\u064A\u0627\u0631\u0629\u064E\u064E       ->    xn--mgbexg9i1aa
>>
>>
>>
>> http://unicode.org/cldr/utility/idna.jsp?a=%D8%B3%D9%8A%D8%A7%D8%B1%D8%A9%D9%8E%0D%0A%D8%B3%D9%8A%D8%A7%D8%B1%D8%A9%D9%8E%D9%8E&f=[%C3%9F+%CF%82+[%3AJoin_C%3A
>>
>>
>>
>> As you can see, both words are displayed the same but have different code points, which might lead to problems. As you know, an NSM is displayed at a fixed position relative to its base character, so any NSM that follows the first one is drawn at that same position and is effectively invisible. The same goes for other NSMs that behave this way. (Am I right?)
>>
>>
>>
>> I know it is a bit late to raise things like this, but might it be a problem?
>>
>>
>>
>> Thank you,
>>
>> Abdulrahman.
>>
>>
>>
>>
>> -----------------------------------------------------------------------
>> Notice:
>> This message and its attachments (if any) constitute a confidential document that may contain information protected by legal privilege. If you are not the intended recipient of this message, you must notify the sender
>> that it reached you in error and delete the message and its attachments (if any) from your computer. You may not copy this message or its attachments (if any), or any part of them,
>> disclose their contents to any person, or use them for any purpose. Note that the statements and opinions contained in this message express only the opinion of the sender and not necessarily the opinion of the Communications and
>> Information Technology Commission, and the Commission bears no responsibility for damages resulting from this e-mail.
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Idna-update mailing list
>> Idna-update at alvestrand.no
>> http://www.alvestrand.no/mailman/listinfo/idna-update
>

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst at it.aoyama.ac.jp
-------------- next part --------------
A non-text attachment was scrubbed...
Name: multiple_NSMs.gif
Type: image/gif
Size: 2791 bytes
Desc: not available
Url : http://www.alvestrand.no/pipermail/idna-update/attachments/20090918/7e686919/attachment.gif 

