Greek Casefolding sigma

Mark Davis mark.davis at icu-project.org
Mon Mar 31 23:16:00 CEST 2008


I'm not talking about modifying Punycode -- we've all agreed that that's out
of scope for the charter. What I was thinking about was postprocessing the
Punycode result to use the case of the letters in the Punycoded string to
carry information about the case of the original string. This is just
blue-skying -- nothing should distract from the main order of business,
which is getting the charter done.
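
To make the idea concrete, here is a minimal sketch in Python. It assumes
the case information has already been reduced to a list of bits, one per
character we care about, and it only changes the case of letters in an
already-encoded label. The names apply_case_bits and read_case_bits are
hypothetical, and the label is just an example string; none of this is
part of Punycode or IDNA.

```python
ACE_PREFIX = "xn--"


def split_ace(label):
    """Split off the ACE prefix, if present, so it is never recased."""
    if label[:4].lower() == ACE_PREFIX:
        return label[:4], label[4:]
    return "", label


def apply_case_bits(label, bits):
    """Walk the encoded label; the i-th letter is uppercased for a 1 bit
    and lowercased for a 0 bit. Digits and hyphens carry no case."""
    prefix, body = split_ace(label)
    letters = sum(1 for ch in body if ch.isalpha())
    if len(bits) > letters:
        raise ValueError("not enough cased letters to carry all bits")
    out, i = [], 0
    for ch in body:
        if ch.isalpha() and i < len(bits):
            out.append(ch.upper() if bits[i] else ch.lower())
            i += 1
        else:
            out.append(ch)
    return prefix + "".join(out)


def read_case_bits(label, count):
    """Recover the first `count` bits from the case of the letters."""
    _, body = split_ace(label)
    letters = [ch for ch in body if ch.isalpha()]
    return [1 if ch.isupper() else 0 for ch in letters[:count]]


label = "xn--nxasmq6b"                      # example label, treated as a plain string
marked = apply_case_bits(label, [1, 0, 1])  # "xn--NxAsmq6b"
assert marked.lower() == label              # case-insensitive matching is unaffected
assert read_case_bits(marked, 3) == [1, 0, 1]
```

Two things fall out of the sketch: the number of bits is capped by the
number of letters in the encoded string, since digits have no case, and
anything that lowercases the label in transit loses the information.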

Mark

On Mon, Mar 31, 2008 at 1:10 PM, Vint Cerf <vint at google.com> wrote:

> I suspect we will spiral into a nonconvergent path if we start modifying
> Punycode. It is out of bounds in any case for the proposed working group
> charter.
>
> ----- Original Message -----
> From: idna-update-bounces at alvestrand.no
>   <idna-update-bounces at alvestrand.no>
> To: Mark Davis <mark.davis at icu-project.org>
> Cc: Sotiris Panaretou <panaretou.sotiris at ucy.ac.cy>;
>   Patrik Fältström <patrik at frobbit.se>;
>   John C Klensin <klensin at jck.com>;
>   Vaggelis Segredakis <segred at ics.forth.gr>;
>   idna-update at alvestrand.no <idna-update at alvestrand.no>
> Sent: Mon Mar 31 02:09:01 2008
> Subject: Re: Greek Casefolding sigma
>
> On Sat, Mar 29, 2008 at 7:49 PM, Mark Davis <mark.davis at icu-project.org>
> wrote:
> > The simplest mechanism would be to then take that set of bits and walk
> > through the Punycode, and for each bit in the vector change a cased
> > letter to uppercase to represent a 1 bit, or leave it lowercase to
> > represent a 0 bit.
>
> I recommend against inventing a new mechanism here. Punycode already
> provides an "originally-uppercase" bit per source character. Within
> IDNA, the uppercase information could be extracted before or during
> folding, and then passed into the Punycode-encoding function.
>
> Unfortunately, there is only one bit per character, which as you point
> out is insufficient in some cases for precise representation of the
> original character. I am not sure if there is room to reliably extend
> the mechanism to 2 bits per character while maintaining compatibility
> and not confusing existing implementations that use the predefined
> mechanism.
>
> markus
> --
> Google Internationalization
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update
>


-- 
Mark