Final Sigma (was: RE: Esszett, Final Sigma, ZWJ and ZWNJ)

Erik van der Poel erikv at google.com
Mon Mar 2 17:05:44 CET 2009


On Mon, Mar 2, 2009 at 5:30 AM, JFC Morfin <jefsey at jefsey.com> wrote:
> 2009/3/1 Erik van der Poel <erikv at google.com>
>> Eszett, ZWJ and ZWNJ should not be
>> placed in the xn-- space. They should receive a different prefix. The
>> local mappers should try both (e.g. Eszett with new prefix, and ss
>> with or without xn--, depending on the rest of the string).
>
> And what if you have Eszett and ZWJ in the same label?

Some registries might disallow Eszett and ZWJ in the same label, and
IDNA2008 might not allow Eszett and ZWJ to be adjacent due to
contextual rules for ZWJ, but they could receive the same prefix, say,
xo--.

>> And if the French really want to distinguish ecole and Ecole, they
>> will not only need a different prefix, but also something other than
>> Punycode (as far as I can tell).
>
> really?
> why?

The current DNS matches ASCII letters case-insensitively, so ecole.fr
and Ecole.fr cannot resolve to different IP addresses. To distinguish
them, you would have to encode the case difference somehow, and that
means you need a prefix. You cannot use xn-- because that is reserved
for Punycode. You cannot use Punycode itself because it does not
encode ASCII; it just copies the ASCII characters directly into the
output. For example, français is encoded as franais-xxa. Note the
directly copied ASCII characters "franais". Appendix A of Punycode
(RFC 3492) does let you annotate upper/lower case, so you might have
xn--Franais-xxa.fr, but this would resolve to the same IP address as
xn--franais-xxa.fr. So you need something other than Punycode, one
that does not copy ASCII characters directly to the output.
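
To make the copying concrete, here is a rough sketch using Python's
built-in "punycode" codec. The helper names to_punycode and dns_equal
are made up for illustration, and dns_equal only approximates DNS
matching by folding ASCII case; it is not a resolver.

```python
# Sketch: Punycode leaves the ASCII part of a label untouched, so a case
# difference in that part cannot survive a case-insensitive DNS comparison.


def to_punycode(label: str) -> str:
    """Encode one label with Python's built-in punycode codec (no xn-- prefix)."""
    return label.encode("punycode").decode("ascii")


def dns_equal(a: str, b: str) -> bool:
    """Hypothetical helper: approximate DNS name matching by folding ASCII case."""
    return a.lower() == b.lower()


lower = to_punycode("français")
upper = to_punycode("Français")

print(lower)  # franais-xxa
print(upper)  # Franais-xxa
print(dns_equal("xn--" + lower, "xn--" + upper))  # True
```

The two encodings differ only in the case of the copied "F", which is
exactly the difference the DNS ignores.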

Erik

