Greek Casefolding sigma

Vint Cerf vint at
Tue Apr 1 00:06:55 CEST 2008

Sorry, I must not have read carefully!

----- Original Message -----
From: mark.edward.davis at <mark.edward.davis at>
To: Vint Cerf
Cc: Markus Scherer; klensin at <klensin at>; patrik at <patrik at>; panaretou.sotiris at <panaretou.sotiris at>; segred at <segred at>; idna-update at <idna-update at>
Sent: Mon Mar 31 14:16:00 2008
Subject: Re: Greek Casefolding sigma

I'm not talking about modifying Punycode -- we've all agreed that that's out of scope for the charter. What I was thinking about was postprocessing the Punycode result, using the case of the letters in the Punycode-encoded string to carry information about the case of the original string. This is just blue-skying -- nothing should distract from the main order of business, which is getting the charter done.
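
To make the blue-sky idea concrete, here is a minimal sketch in Python of carrying a bit vector in the letter case of an already-encoded Punycode string. It is not part of Punycode or IDNA; the function names embed_bits and extract_bits are hypothetical, and it assumes the input is the lowercase Punycode output without the "xn--" prefix and that the bit vector has been computed elsewhere.

```python
def embed_bits(label: str, bits: list[int]) -> str:
    """Walk the label; uppercase the i-th ASCII letter when bits[i] is 1.

    Hypothetical helper: digits and hyphens carry no case, so they are
    skipped and consume no bits. Missing trailing bits are treated as 0.
    """
    out = []
    remaining = iter(bits)
    for ch in label:
        if ch.isascii() and ch.isalpha():
            bit = next(remaining, 0)
            out.append(ch.upper() if bit else ch.lower())
        else:
            out.append(ch)
    if next(remaining, None) is not None:
        raise ValueError("more bits than cased letters in label")
    return "".join(out)


def extract_bits(label: str) -> tuple[str, list[int]]:
    """Recover the lowercase label and the bit vector from letter case."""
    bits = [1 if ch.isupper() else 0
            for ch in label if ch.isascii() and ch.isalpha()]
    return label.lower(), bits


# "bcher-kva" is the Punycode form of "bücher" (without the "xn--" prefix).
marked = embed_bits("bcher-kva", [1, 0, 0, 0, 0, 0, 0, 1])
print(marked)
print(extract_bits(marked))
```

The capacity is limited to one bit per cased letter in the encoded string, and the information survives only where the case of the ASCII form is preserved end to end.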


On Mon, Mar 31, 2008 at 1:10 PM, Vint Cerf <vint at> wrote:

	I suspect we will spiral into a nonconvergent path if we start modifying Punycode. It is out of bounds in any case for the proposed working group charter.

	----- Original Message -----
	From: idna-update-bounces at <idna-update-bounces at>
	To: Mark Davis <mark.davis at>
	Cc: Sotiris Panaretou <panaretou.sotiris at>; Patrik Fältström <patrik at>; John C Klensin <klensin at>; Vaggelis Segredakis <segred at>; idna-update at <idna-update at>
	Sent: Mon Mar 31 02:09:01 2008
	Subject: Re: Greek Casefolding sigma
	On Sat, Mar 29, 2008 at 7:49 PM, Mark Davis <mark.davis at> wrote:
	> The simplest mechanism would be to then take that set of bits and walk
	> through the Punycode, and for each bit in the vector changing each cased
	> letter to uppercase to represent a 1 bit, and leaving it lowercase to
	> represent a 0 bit.
	I recommend against inventing a new mechanism here. Punycode already
	provides an "originally-uppercase" bit per source character. Within
	IDNA, the uppercase information could be extracted before or during
	folding, and then passed into the Punycode-encoding function.
	Unfortunately, there is only one bit per character, which as you point
	out is insufficient in some cases for precise representation of the
	original character. I am not sure if there is room to reliably extend
	the mechanism to 2 bits per character while maintaining compatibility
	and not confusing existing implementations that use the predefined [...]

	Google Internationalization
	Idna-update mailing list
	Idna-update at


