AW: Consensus Call on Latin Sharp S and Greek Final Sigma
alexander.mayrhofer at nic.at
Mon Nov 30 16:47:45 CET 2009
> My question about weiß.de and WEIẞ.DE stands.
> When people did things by hand, sure it was one thing to convert Weiß
> to WEISS or WEISZ. But we now have a capital ẞ entering the world of
> data (since Unicode 5.1) and I don't think it's unreasonable to think
> that Herr Weiß may actually think of WEIẞ.DE as equivalent to
> his weiß.de.
In the base protocol, U+1E9E is DISALLOWED (as are, for example, the uppercase umlauts). However, clients are free to implement almost any mapping, and the current set of documents provides an informational mapping.
The relevant text from draft-ietf-idnabis-mappings-05 is:
o In order to map upper case characters to their lower case
equivalents (defined in section 3.13 of [Unicode51]), first map
characters to the "Lowercase_Mapping" property (the "<lower>"
entry in the second column) in
<http://www.unicode.org/Public/UNIDATA/SpecialCasing.txt>, if any.
Then, map characters to the "Simple_Lowercase_Mapping" property
(the fourteenth column) in
<http://www.unicode.org/Public/UNIDATA/UnicodeData.txt>, if any.
If I haven't got it wrong, this operation would map U+1E9E (ẞ) to U+00DF (ß), and hence achieve exactly what you want.
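A quick way to check this, as a sketch and not an IDNA implementation: Python's built-in str.lower() applies the Unicode lowercase mappings shipped with the interpreter. Two assumptions apply here. First, the interpreter's Unicode data is usually newer than 5.1. Second, str.lower() is only a stand-in for the two-step lookup quoted above. The helper name map_label_to_lower is hypothetical.

```python
def map_label_to_lower(label: str) -> str:
    """Lowercase a label using Python's built-in Unicode case mappings.

    Assumption: str.lower() stands in for the two-step lookup
    (SpecialCasing.txt first, then UnicodeData.txt) described in the
    draft. It is not the draft's reference algorithm.
    """
    return label.lower()


if __name__ == "__main__":
    upper = "WEI\u1E9E.DE"  # WEIẞ.DE, written with U+1E9E (capital sharp s)
    lower = map_label_to_lower(upper)
    print(lower)
    # Show the code points so U+00DF is visible in the result
    print([f"U+{ord(c):04X}" for c in lower])
```

On a Python 3 interpreter this should give weiß.de, with U+00DF in the fourth position. The sketch covers only the lowercase mapping step and does not check whether the result is a valid label.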