Greek Casefolding sigma
John C Klensin
klensin at jck.com
Sat Mar 29 17:03:01 CET 2008
--On Saturday, March 29, 2008 8:47 AM -0700 Mark Davis
<mark.davis at icu-project.org> wrote:
> Patrik, you misunderstand. I'm not saying that this should be
> part of the protocol. What I'm saying is that the protocol,
> *combined with a postprocessing step for UIs*, would handle
> the situation.
>
> (In retrospect -- and water long under the bridge -- we would
> have been better off to use one of the variants of Punycode,
> which has the ability to encode case and other distinguishing
> information in the original Unicode using case in the ASCII
> form. Had we gone that route, we could have maintained the
> visual distinctions on output of DNS for sigma and similar
> cases, because the DNS does a caseless compare for A-Z.)
Mark,
Unless I misunderstand what you are suggesting, that punycode
variation would not have helped. Because the code points are
different, punycode(raw-upper-case-string) is not going to
contain the same characters as
punycode(equivalent-lower-case-string). One could use punycode
case to encode things the way you suggest, but only by case
folding first and then using the punycode case to indicate "used
to be upper case". But that wouldn't help for the sigma
situation because the case folding operation itself is what
loses the information (about final form, not really case), and
that isn't subject to a binary "upper/lower" switch.
Or have I missed something in what you are suggesting?
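The two points in the argument above can be checked with a minimal sketch. It is an illustration only: it assumes Python's built-in str.casefold() and "punycode" codec as stand-ins for the case folding and encoding steps, not the exact IDNA mapping tables, and the sample word and the helper names fold and puny are hypothetical choices made for the example.

```python
# Illustration only: Python's str.casefold() and built-in "punycode" codec
# stand in for the folding and encoding steps discussed in the thread.

def fold(label: str) -> str:
    """Apply Unicode case folding (context-insensitive)."""
    return label.casefold()

def puny(label: str) -> bytes:
    """Encode a label with Python's built-in Punycode codec."""
    return label.encode("punycode")

upper = "\u03a3\u039f\u03a6\u039f\u03a3"        # capitals: sigma omicron phi omicron sigma
lower_final = "\u03c3\u03bf\u03c6\u03bf\u03c2"  # lowercase, final sigma (U+03C2) at the end
lower_plain = "\u03c3\u03bf\u03c6\u03bf\u03c3"  # lowercase, ordinary sigma (U+03C3) throughout

# Point 1: the code points differ, so the Punycode output differs even
# when the ASCII result is compared without regard to case, as the DNS does.
encodings_differ = puny(upper).lower() != puny(lower_final).lower()

# Point 2: case folding maps all three spellings to one string ...
folded_same = fold(upper) == fold(lower_final) == fold(lower_plain)

# ... and that folded string no longer contains a final sigma, so the
# final-form information is gone before any encoding step runs.
final_form_lost = "\u03c2" not in fold(lower_final)

print(encodings_differ, folded_same, final_form_lost)
```

Folding first makes the three spellings comparable, but only by discarding the distinction between final and non-final sigma, and a per-character upper/lower flag in the ASCII form has no spare state in which to record it.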
But, one way or the other, certainly water under the bridge.
john
More information about the Idna-update
mailing list