Casefolding Sigma (was: Re: IDNAbis Preprocessing Draft)
Kenneth Whistler
kenw at sybase.com
Tue Jan 22 01:24:55 CET 2008
Harald wondered:
> I do wonder what your mapping tables look like for the trailing Greek
> sigma - that's the canonical case of a context dependent case-mapping,
> just as the dotless I is the canonical case of a language dependent
> case-mapping.
There's no particular need to wonder -- the answers are
right there in the data tables. CaseFolding.txt:
03A3; C; 03C3; # GREEK CAPITAL LETTER SIGMA
03C2; C; 03C3; # GREEK SMALL LETTER FINAL SIGMA
In other words, U+03A3 and U+03C2 both case fold to
U+03C3 GREEK SMALL LETTER SIGMA.
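A quick way to confirm this without reading the data file is sketched below. It assumes Python's built-in str.casefold(), which applies full Unicode case folding; for these three code points that gives the same result as the simple (status C) entries quoted above. The names SIGMAS and fold_codepoints are made up for this illustration.

```python
# Illustrative check only: fold each sigma and report the result as U+XXXX.
SIGMAS = {
    "U+03A3": "\u03a3",  # GREEK CAPITAL LETTER SIGMA
    "U+03C2": "\u03c2",  # GREEK SMALL LETTER FINAL SIGMA
    "U+03C3": "\u03c3",  # GREEK SMALL LETTER SIGMA
}


def fold_codepoints(s: str) -> str:
    """Case fold s and return the resulting code points in U+XXXX notation."""
    return " ".join(f"U+{ord(ch):04X}" for ch in s.casefold())


# Map each input code point to the code point(s) it folds to.
folded = {name: fold_codepoints(ch) for name, ch in SIGMAS.items()}

for name, target in folded.items():
    print(f"{name} -> {target}")
```

The results depend on the Unicode data shipped with the Python runtime in use, not on any particular release of CaseFolding.txt.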
This accounts for why, in the derivation that I posted about
a couple of weeks ago, U+03C3 is in the IDN_Always.txt
table, while U+03A3 and U+03C2 are not: they are in
IDN_Never.txt instead.
draft-faltstrom-idnabis-tables-03.txt has not yet
fully taken case folding stability into account, IMO,
so it has:
03A3 NEVER
but
03C2 ALWAYS
03C3 ALWAYS
03C2 should be NEVER, by the Category C, Casefolding rule.
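The stability test behind that rule can be sketched as follows. This is a minimal illustration that assumes the rule amounts to "a code point that does not case fold to itself is NEVER"; it models only that one rule, not the rest of the derivation in the draft. The function names is_casefold_stable and casefold_rule are hypothetical, and str.casefold() stands in for a lookup in CaseFolding.txt.

```python
def is_casefold_stable(cp: int) -> bool:
    """True if the code point is unchanged by Unicode case folding."""
    ch = chr(cp)
    return ch.casefold() == ch


def casefold_rule(cp: int) -> str:
    """Apply only the case folding stability test (an assumed simplification).

    An unstable code point is reported as NEVER. A stable one merely passes
    this test; its final status depends on rules not modelled here.
    """
    return "passes" if is_casefold_stable(cp) else "NEVER"


for cp in (0x03A3, 0x03C2, 0x03C3):
    print(f"{cp:04X} {casefold_rule(cp)}")
```

Under this assumption, 03A3 and 03C2 both come out as NEVER, and only 03C3 passes.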
--Ken