Greek Casefolding sigma

John C Klensin klensin at jck.com
Sat Mar 29 17:03:01 CET 2008



--On Saturday, March 29, 2008 8:47 AM -0700 Mark Davis 
<mark.davis at icu-project.org> wrote:

> Patrik, you misunderstand. I'm not saying that this should be
> part of the protocol. What I'm saying is that the protocol,
> *combined with a postprocessing step for UIs*, would handle
> the situation.
>
> (In retrospect -- and water long under the bridge -- we would
> have been better off to use one of the variants of Punycode,
> which has the ability to encode case and other distinguishing
> information in the original Unicode using case in the ASCII
> form. Had we gone that route, we could have maintained the
> visual distinctions on output of DNS for sigma and similar
> cases, because the DNS does a caseless compare for A-Z.)

Mark,

Unless I misunderstand what you are suggesting, that Punycode
variation would not have helped. Because the code points are
different, punycode(raw-upper-case-string) does not produce the
same output as punycode(equivalent-lower-case-string). One could
use Punycode case to encode things the way you suggest, but only
by case folding first and then using the case of the ASCII form
to indicate "used to be upper case". That would not help with
the sigma situation, because the case folding operation itself
is what loses the information (which concerns final form, not
really case), and that distinction cannot be carried by a binary
"upper/lower" switch.
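
To make that concrete, here is a rough sketch in Python. It
assumes Python's built-in "punycode" codec and str.casefold() are
fair stand-ins for the Punycode encoding and the case folding
step; the sample word and the variable names are mine, chosen
only for illustration.

```python
# Three spellings of one Greek word, written with escapes so the
# sigma forms are unambiguous.
upper = "\u039f\u0394\u039f\u03a3"        # all upper case, ends in U+03A3
lower_final = "\u03bf\u03b4\u03bf\u03c2"  # lower case, final sigma U+03C2
lower_plain = "\u03bf\u03b4\u03bf\u03c3"  # lower case, ordinary sigma U+03C3

# Different code points in, different Punycode out.
puny_upper = upper.encode("punycode")
puny_final = lower_final.encode("punycode")
puny_plain = lower_plain.encode("punycode")
all_distinct = len({puny_upper, puny_final, puny_plain}) == 3

# Case folding maps all three spellings to the same string.
folded = {s.casefold() for s in (upper, lower_final, lower_plain)}
folding_merges_all = len(folded) == 1

# A per-character "used to be upper case" flag is identical for the
# two lower-case spellings, so it cannot say which sigma was used.
flags_final = [ch.isupper() for ch in lower_final]
flags_plain = [ch.isupper() for ch in lower_plain]
flags_identical = flags_final == flags_plain

print(all_distinct, folding_merges_all, flags_identical)
```

Once the strings are folded, the only thing a case flag could
restore is "upper versus lower"; the final-form sigma has no
flag value of its own.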

Or have I missed something in what you are suggesting?

But, one way or the other, this is certainly water under the
bridge.

      john





More information about the Idna-update mailing list