UTC Agenda Item: IDNA proposal
Sam Vilain
sam.vilain at catalyst.net.nz
Sun Nov 26 21:33:48 CET 2006
Harald Alvestrand wrote:
>> In fact, it looks like U+0A86 : આ is actually U+0A85 U+0ABE : અા, I
>> guess there needs to be a Stringprep-like normalisation step for these.
>> So, maybe U+0A86 is not needed. - eg U+0A94 : ઔ could be U+0A85 U+0ABE
>> U+0AC8 : અાૈ. This is not a perfect homograph with the Padmaa font,
>> but it is on the Unicode.org code chart.
>>
>
> would it be harmful to include those, apart from the confusables problem?
>
What else do I need to be aware of, other than the confusables issue?
> Or do you think that they "should have had" canonical/compatibility
> decompositions, so that they would go away under the NFKC rule?
>
That looks to be the case. But, as Patrik mentioned in another branch
of this thread, it's not the IETF's job to set Unicode policy.
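
For anyone wanting to confirm that NFKC does not fold these away, here is a
rough check. It assumes Python's standard unicodedata module; the names
CANDIDATES, survives_nfkc and report are just mine for illustration. An empty
decomposition string means the Unicode data carries no canonical or
compatibility decomposition for that code point.

```python
import unicodedata

# Code points discussed in this thread (Gujarati AA, AU and OM).
CANDIDATES = ["\u0A86", "\u0A94", "\u0AD0"]

def survives_nfkc(ch):
    """Return True if NFKC normalisation leaves the character unchanged."""
    return unicodedata.normalize("NFKC", ch) == ch

def report(chars):
    """List (code point, decomposition field, survives NFKC) for each character."""
    rows = []
    for ch in chars:
        rows.append((f"U+{ord(ch):04X}", unicodedata.decomposition(ch), survives_nfkc(ch)))
    return rows

for cp, decomp, kept in report(CANDIDATES):
    print(cp, repr(decomp), kept)
```

All three come back with an empty decomposition and survive NFKC unchanged,
which is why the NFKC rule alone does not make them go away.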
>> Again, U+0AD0 : ૐ is a Sanskrit symbol and its duplication at U+0950 :
>> ॐ is regrettable. Probably the Devanagari version should "win".
>>
> by "win", do you mean that there should be a canonical decomposition of
> U+0AD0 to U+0950?
>
Yes, precisely.
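
Since Unicode has no such decomposition, the only way to get that effect
today would be an extra mapping step layered on top of NFKC. A minimal sketch
of what I mean follows; the EXTRA_MAP table and the fold function are
hypothetical, invented here for illustration, and are not part of Unicode,
Stringprep or any IDNA table.

```python
import unicodedata

# Hypothetical extra mappings, NOT defined by Unicode or any IDNA/Stringprep table.
EXTRA_MAP = {
    "\u0AD0": "\u0950",        # GUJARATI OM -> DEVANAGARI OM
    "\u0A86": "\u0A85\u0ABE",  # GUJARATI LETTER AA -> LETTER A + VOWEL SIGN AA
}

def fold(label):
    """Apply NFKC, then the hypothetical extra mappings, character by character."""
    normalized = unicodedata.normalize("NFKC", label)
    return "".join(EXTRA_MAP.get(ch, ch) for ch in normalized)
```

With this, fold() of the Gujarati OM and of the Devanagari OM give the same
string, so the two spellings would compare equal after preparation.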
--
Sam Vilain, Systems Architect, Catalyst IT (NZ) Ltd.
phone: +64 4 499 2267 PGP ID: 0x66B25843