Greek Casefolding sigma

Mon Mar 31 22:10:59 CEST 2008

I suspect we will spiral into a nonconvergent path if we start modifying Punycode. It is out of bounds in any case for the proposed working group charter.

----- Original Message -----
From: idna-update-bounces at alvestrand.no <idna-update-bounces at alvestrand.no>
To: Mark Davis <mark.davis at icu-project.org>
Cc: Sotiris Panaretou <panaretou.sotiris at ucy.ac.cy>; Patrik Fältström <patrik at frobbit.se>; John C Klensin <klensin at jck.com>; Vaggelis Segredakis <segred at ics.forth.gr>; idna-update at alvestrand.no <idna-update at alvestrand.no>
Sent: Mon Mar 31 02:09:01 2008
Subject: Re: Greek Casefolding sigma

On Sat, Mar 29, 2008 at 7:49 PM, Mark Davis <mark.davis at icu-project.org> wrote:
> The simplest mechanism would be to then take that set of bits and walk
> through the Punycode, and for each bit in the vector changing each cased
> letter to uppercase to represent a 1 bit, and leaving it lowercase to
> represent a 0 bit.

I recommend against inventing a new mechanism here. Punycode already
provides an "originally-uppercase" bit per source character. Within
IDNA, the uppercase information could be extracted before or during
folding, and then passed into the Punycode-encoding function.
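
As a rough illustration of extracting the uppercase information during
folding, here is a minimal Python sketch. The function name
extract_case_flags is hypothetical, and Python's str.casefold() is used
only as a stand-in for whatever folding step IDNA would apply. The
resulting flag list is what one would hand to a Punycode encoder that
accepts per-character case flags.

```python
from typing import List, Tuple


def extract_case_flags(label: str) -> Tuple[str, List[bool]]:
    """Fold a label and record one 'originally-uppercase' flag per folded code point.

    Hypothetical helper: str.casefold() stands in for the real folding step.
    """
    folded_chars: List[str] = []
    flags: List[bool] = []
    for ch in label:
        folded = ch.casefold()
        folded_chars.append(folded)
        # A source character that folds to several code points (e.g. ß -> ss)
        # repeats its flag so flags stay aligned with the folded string.
        flags.extend([ch.isupper()] * len(folded))
    return "".join(folded_chars), flags
```

For example, extract_case_flags("Σοφος") would yield the folded string
"σοφοσ" with the flags [True, False, False, False, False].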

Unfortunately, there is only one bit per character, which as you point
out is insufficient in some cases for precise representation of the
original character. I am not sure if there is room to reliably extend
the mechanism to 2 bits per character while maintaining compatibility
and not confusing existing implementations that use the predefined
mechanism.
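
To make the one-bit limitation concrete, the sketch below round-trips the
three sigma forms through a fold plus a single uppercase flag. The helper
names are hypothetical and str.casefold() again only stands in for the
real folding step. It is an illustration of the problem, not a proposal
for a 2-bit encoding.

```python
from typing import List, Tuple

# Capital sigma, small sigma, final small sigma: all fold to small sigma.
SIGMA_FORMS = ["Σ", "σ", "ς"]


def fold_with_flag(ch: str) -> Tuple[str, bool]:
    """Fold one character and keep a single 'originally-uppercase' bit."""
    return ch.casefold(), ch.isupper()


def restore(folded: str, was_upper: bool) -> str:
    """Best-effort reconstruction from the folded form and the one bit."""
    return folded.upper() if was_upper else folded


def lossy_forms(forms: List[str]) -> List[str]:
    """Return the characters that do not survive the one-bit round trip."""
    return [ch for ch in forms if restore(*fold_with_flag(ch)) != ch]
```

With these definitions, lossy_forms(SIGMA_FORMS) returns ["ς"]: the
capital and the ordinary small sigma survive, but final sigma comes back
as an ordinary small sigma, since three source forms cannot be told apart
with one bit.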

markus
-- 
Google Internationalization
_______________________________________________
Idna-update mailing list
Idna-update at alvestrand.no
http://www.alvestrand.no/mailman/listinfo/idna-update