UTC Agenda Item: IDNA proposal

Sam Vilain sam.vilain at catalyst.net.nz
Sun Nov 26 21:33:48 CET 2006

Harald Alvestrand wrote:
>> In fact, it looks like U+0A86 : આ is actually U+0A85 U+0ABE : અા, I
>> guess there needs to be a Stringprep-like normalisation step for these.
>> So, maybe U+0A86 is not needed. - eg U+0A94 : ઔ could be U+0A85 U+0ABE
>> U+0AC8 : અાૈ. This is not a perfect homograph with the Padmaa font,
>> but it is on the Unicode.org code chart.
> would it be harmful to include those, apart from the confusables problem?

What else do I need to be aware of, other than the confusables issue?

> Or do you think that they "should have had" canonical/compatibility 
> decompositions, so that they would go away under the NFKC rule?

That appears to be the case.  But, as Patrik mentioned in another branch
of this thread, it's not the IETF's job to set Unicode policy.
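
For anyone who wants to check the current state of affairs, here is a
rough sketch, assuming Python's standard unicodedata module is a fair
reflection of the Unicode Character Database.  The CANDIDATES table and
the nfkc_equal helper are my own names for illustration, and the pairs
are just the ones quoted above.  It reports whether each precomposed
vowel has a decomposition mapping, and whether NFKC makes it equal to
the look-alike sequence.

```python
import unicodedata

# Precomposed independent vowels and the look-alike sequences discussed
# above. This table is illustrative only, not an official confusables list.
CANDIDATES = {
    "\u0A86": "\u0A85\u0ABE",        # LETTER AA vs LETTER A + VOWEL SIGN AA
    "\u0A94": "\u0A85\u0ABE\u0AC8",  # LETTER AU vs LETTER A + signs AA, AI
}


def nfkc_equal(a: str, b: str) -> bool:
    """Return True if a and b become identical after NFKC normalisation."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


for single, sequence in CANDIDATES.items():
    print(
        f"U+{ord(single):04X}",
        "decomposition:", repr(unicodedata.decomposition(single)),
        "NFKC-equal to sequence:", nfkc_equal(single, sequence),
    )
```

An empty decomposition string together with a False result means the
NFKC rule alone does not make the pair go away.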

>> Again, U+0AD0 : ૐ is a Sanskrit symbol and its duplication at U+0950 :
>> ॐ is regrettable. Probably the Devanagari version should "win".
> by "win", do you mean that there should be a canonical decomposition of 
> U+0AD0 to U+0950?

Yes, precisely.
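
Since Unicode has no such decomposition today, the nearest equivalent
would be the Stringprep-like mapping step mentioned earlier.  A minimal
sketch of what that could look like follows.  EXTRA_MAP and
map_then_nfkc are hypothetical names, and the table is not part of
Unicode, Stringprep or any IDNA document; it only illustrates the
Devanagari code point "winning".

```python
import unicodedata

# Hypothetical extra mapping table (an assumption for illustration only).
EXTRA_MAP = {
    "\u0AD0": "\u0950",  # GUJARATI OM -> DEVANAGARI OM
}


def map_then_nfkc(label: str) -> str:
    """Apply the extra per-character mapping, then normalise with NFKC."""
    mapped = "".join(EXTRA_MAP.get(ch, ch) for ch in label)
    return unicodedata.normalize("NFKC", mapped)


# With the extra step, both forms of the symbol yield the same string.
print(map_then_nfkc("\u0AD0") == map_then_nfkc("\u0950"))
```

Without the extra table, NFKC on its own leaves U+0AD0 unchanged.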

Sam Vilain, Systems Architect, Catalyst IT (NZ) Ltd.
phone: +64 4 499 2267 PGP ID: 0x66B25843
