IDNA online tool

Mark Davis mark at macchiato.com
Fri Apr 10 17:49:38 CEST 2009


Mark


On Fri, Apr 10, 2009 at 04:01, "Martin J. Dürst" <duerst at it.aoyama.ac.jp> wrote:

> Hello Mark,
>
> You seem to be right; I forwarded a bug report to Opera.
>
> On 2009/04/10 12:48, Mark Davis wrote:
>
>> The M-Label string is \uFECB\uFEAE\uFE91\uFEF2
>> The U-Label string is \u0639\u0631\u0628\u064A
>>
>
> I guess U+ notation would be better; nobody needs to know you used Java
> :-).


I can do that if you think it would be less confusing, or for that matter
use &#x...; notation or \x{...} notation. The only problem with U+ notation
is that you have to leave spaces around it if you mix it with unescaped text;
the other notations were designed so that they can be intermixed with other
text.
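
For illustration, here is a minimal Python sketch that prints one string in
each of the notations mentioned above. The function name `escape_forms` and
the dictionary keys are hypothetical, chosen only for this example; the
\uXXXX form as written here assumes BMP characters only.

```python
def escape_forms(s):
    """Return the same string written in four escape notations."""
    return {
        # U+ notation needs separators, so code points are space-delimited.
        "u_plus": " ".join(f"U+{ord(c):04X}" for c in s),
        # Java-style \uXXXX (this sketch assumes BMP code points only).
        "backslash_u": "".join(f"\\u{ord(c):04X}" for c in s),
        # HTML/XML numeric character references.
        "ncr": "".join(f"&#x{ord(c):X};" for c in s),
        # Perl-style \x{...} escapes.
        "backslash_x": "".join(f"\\x{{{ord(c):X}}}" for c in s),
    }

u_label = "\u0639\u0631\u0628\u064A"
for name, text in escape_forms(u_label).items():
    print(name, text)
```

The last three forms are self-delimiting, which is why they can sit directly
next to unescaped text, while the U+ form relies on the surrounding spaces.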


>
>
>  The M-Label does map to the U-Label (in IDNA, and in IDNAbis unless we
>> restrict the mapping to exclude these characters).
>>
>
> I and others have proposed that we should restrict the mapping for these,
> unless we get feedback from the Arabic script community otherwise.


I'll send out another message on this.

>
>
>  Every conformant browser, where fonts are available, should give
>> essentially
>> the same rendering for both strings (which is why they are mapped together
>> by NFKC).
>>
>
> Well, no. On the one hand, NFKC equivalences often clearly render
> differently (think about circled numbers,...), and on the other hand,
> compatibility Arabic only renders the same as the normalized version if the
> connectivity variants are carefully selected to follow the connectivity
> rules of the Arabic script. As an example, the NFKC form of
> ﺶﺷ is شش, which doesn't look the same.


That is true, and I didn't say it was the only reason. The key is that when
the presentation forms are used, they'll typically be used with the right
shapes for the particular word, which means that if someone cuts and pastes
a particular word that looks identical to another, they'll be confused about
why one works and the other doesn't. I agree that that is not the case for
many other NFKC cases.
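
The NFKC behaviour discussed in this exchange can be checked with Python's
standard `unicodedata` module. This is only a sketch of the normalization
step, not of the full IDNA mapping; the helper name `nfkc` is hypothetical,
and the two-character example is assumed to be U+FEB6 followed by U+FEB7.

```python
import unicodedata

def nfkc(s):
    """Apply Unicode NFKC normalization (one step of the IDNA mapping only)."""
    return unicodedata.normalize("NFKC", s)

# The M-Label and U-Label strings quoted earlier in the thread.
m_label = "\uFECB\uFEAE\uFE91\uFEF2"
u_label = "\u0639\u0631\u0628\u064A"
print(nfkc(m_label) == u_label)

# Martin's example: presentation forms chosen against the joining rules
# (assumed here to be U+FEB6 followed by U+FEB7).
mismatched = "\uFEB6\uFEB7"
for ch in mismatched:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
print(nfkc(mismatched) == "\u0634\u0634")

# A non-Arabic NFKC pair that clearly renders differently: circled digit one.
print(nfkc("\u2460") == "1")
```

Each comparison should hold, which matches both points: the presentation
forms normalize to the plain letters, yet nothing in NFKC guarantees that the
two sides of a mapping look alike when rendered.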

>
>
> Regards,   Martin.
>
> --
> #-# Martin J. Dürst, Professor, Aoyama Gakuin University
> #-# http://www.sw.it.aoyama.ac.jp   mailto:duerst at it.aoyama.ac.jp
>