Q3: What characters should be allowed in a revised IDNA2008 specification?

Harald Alvestrand harald at alvestrand.no
Thu Apr 2 21:52:31 CEST 2009


Mark Davis wrote:
> I agree, and I think there is rough consensus to that effect.
>
> My biggest concern as far as a transitional appendix is not the Hearts 
> and others, but the cases where IDNA2003 and IDNA2008 produce 
> *divergent* completely valid A-Labels. For that, I think we must have 
> a very good story because of security and interoperability issues.
>
> If we have M-Labels (whether full NFKC-CF-RDI or some subset), then it 
> looks like the sigma and eszett would go away, so that leaves us with 
> only two cases; ZWJ and ZWNJ. In that case the recommended 
> transitional strategy can devolve to:
>
>     * Lookup with IDNA2008. If it fails, remove any ZWJ/NJ, and try again.
>
If we go with lowercasing rather than casefolding, eszett and final 
sigma survive.

If we go with lowercasing, "CyprusTravel" in Greek can be represented 
with a final sigma for "Cyprus"; if we go with casefolding, we end up 
with "cyprustravel" using a non-final sigma.
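
As a rough illustration of the difference, here is a minimal sketch using 
Python's built-in str.lower() and str.casefold() as stand-ins for Unicode 
lowercasing and case folding. This is an assumption made for illustration 
only: neither call is the IDNA mapping itself, and the sample labels are 
made up.

```python
# Illustrative only: str.lower() and str.casefold() stand in for Unicode
# lowercasing and case folding; they are not the IDNA2008 mapping.
FINAL_SIGMA = "\u03c2"  # GREEK SMALL LETTER FINAL SIGMA
SIGMA = "\u03c3"        # GREEK SMALL LETTER SIGMA
ESZETT = "\u00df"       # LATIN SMALL LETTER SHARP S

# Hypothetical label: Greek "Cyprus" (ending in a final sigma) + "Travel".
label = "\u039a\u03cd\u03c0\u03c1\u03bf" + FINAL_SIGMA + "Travel"

lowered = label.lower()     # lowercasing keeps the final sigma
folded = label.casefold()   # case folding maps final sigma to plain sigma

print(lowered, FINAL_SIGMA in lowered)
print(folded, FINAL_SIGMA in folded)

# The same contrast for eszett.
german = "Stra" + ESZETT + "e"
print(german.lower(), german.casefold())
```

With lowercasing the final sigma and the eszett are still present in the 
result; with case folding they are replaced by a plain sigma and "ss".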

                        Harald

