AW: Consensus Call on Latin Sharp S and Greek Final Sigma

Alexander Mayrhofer alexander.mayrhofer at nic.at
Mon Nov 30 16:47:45 CET 2009


 

> My question about weiß.de and WEIẞ.DE stands.
>
> When people did things by hand, sure it was one thing to convert Weiß
> to WEISS or WEISZ. But we now have a capital ẞ entering the world of
> data (since Unicode 5.1) and I don't think it's unreasonable to think
> that Herr Weiß may actually think of WEIẞ.DE as equivalent to
> his weiß.de.

In the base protocol, U+1E9E is DISALLOWED (as are, for example, upper case umlauts). However, clients are free to implement almost any mapping, and the current set of documents provides an informational mapping:

The relevant text from draft-ietf-idnabis-mappings-05 is:

   o  In order to map upper case characters to their lower case
      equivalents (defined in section 3.13 of [Unicode51]), first map
      characters to the "Lowercase_Mapping" property (the "<lower>"
      entry in the second column) in
      <http://www.unicode.org/Public/UNIDATA/SpecialCasing.txt>, if any.
      Then, map characters to the "Simple_Lowercase_Mapping" property
      (the fourteenth column) in
      <http://www.unicode.org/Public/UNIDATA/UnicodeData.txt>, if any.

If I haven't got it wrong, this operation would map U+1E9E to U+00DF, and hence achieve exactly what you want.
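
As a quick sanity check of that mapping, here is a minimal sketch in Python. It does not parse SpecialCasing.txt or UnicodeData.txt as the draft describes; it assumes that Python's built-in str.lower(), which uses the Unicode case data shipped with the interpreter, gives the same result for this character. The function name map_to_lowercase is made up for illustration.

```python
def map_to_lowercase(label: str) -> str:
    """Lowercase a label before lookup (hypothetical helper).

    Assumption: str.lower() stands in for the two-step mapping quoted
    above; it relies on the Unicode tables bundled with Python rather
    than reading the UNIDATA files directly.
    """
    return label.lower()


# U+1E9E is LATIN CAPITAL LETTER SHARP S, U+00DF is the small sharp s.
mapped = map_to_lowercase("WEI\u1E9E.DE")
print(mapped)
print([f"U+{ord(ch):04X}" for ch in mapped])
```

Note that this only illustrates the client-side mapping step; the protocol itself would still see only the already-lowercased label.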

--
Alex

