Not Folding Case (was: Re: Eszett ( was AW: Esszett, Final Sigma, ZWJ and ZWNJ)

Martin Duerst duerst at it.aoyama.ac.jp
Wed Feb 25 06:02:31 CET 2009


This is a well-known phenomenon. French people are told in school that
there are no accents on upper-case letters, and then live on with
that belief. Thereafter, they regularly see upper-case letters with
accents, but they don't realize that they might have to change their
beliefs, because reading these upper-case letters with accents happens
unconsciously. Some might even claim that something is wrong when
somebody shows them an example of an upper-case letter with an
accent.

The best way to confirm that accents can and do appear on upper-case
letters is to check the "Petit Robert", the most widely used
French-French dictionary.

Regards,    Martin.

P.S.: Effects such as the above are one reason why Internationalization
      is difficult. It's not enough to ask a few native speakers/writers,
      one has to find out who the real experts are and confirm things
      in the field with an open eye.

P.P.S.: The above is not just hearsay; I had several such experiences
        with relatives of mine.


At 06:22 09/02/25, Mark Davis wrote:
>Moreover, the supposedly required deaccenting of uppercase French appears to be a canard. Not only is it contested by internationalization experts like Michel Suignard, but even casual browsing will find respectable usage of uppercase with accents such as:
>
>http://www.lemonde.fr/
>
>e.g. on that page "LE MONDE DES SÉRIES", plus the tabs.
>
>Mark
>
>
>On Tue, Feb 24, 2009 at 13:12, Kenneth Whistler <kenw at sybase.com> wrote:
>>jfc said:
>>
>>> May I add that French supports calls for upper cases NOT to be folded
>>> but to be supported as characters by their own.
>>> This means that "ecole.fr", "école.fr" and "Ecole.fr" are to be three
>>> different domain names.
>>
>>which is just silly. The implication of that is that the
>>following would also be different domain names:
>>
>>eCole.fr
>>ecOle.fr
>>ecoLe.fr
>>ecolE.fr
>>ECole.fr
>>EcOle.fr
>>EcoLe.fr
>>EcolE.fr
>>ECOle.fr
>>ECoLe.fr
>>EColE.fr
>>
>>etc., etc., for 32 different strings, before even starting
>>to consider the accent folding issues.
>>
>>This is incompatible both with existing ASCII domain name
>>usage *and* with IDNA 2003 domain name usage. And it
>>would result in a combinatorial bundling nightmare requiring
>>2^n items be bundled for every n Latin (or Greek or Cyrillic)
>>letter in a domain name.
>>
>>And no, you cannot get away with claiming this would only
>>apply to the first letter of a domain name, because there
>>is no mechanism in IDNA for parsing out words in domain
>>name labels, viz.:
>>
>>http://dangerecole.blogspot.com/
>>
>>as opposed to:
>>
>>http://www.ecoleprinceton.org/
>>
>>or
>>
>>http://www.ecolephilippegaulier.com/
>>
>>--Ken
>>
>>_______________________________________________
>>Idna-update mailing list
>>Idna-update at alvestrand.no
>>http://www.alvestrand.no/mailman/listinfo/idna-update

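The 2^n count in Ken's quoted message can be checked with a minimal
sketch. It enumerates every upper/lower-case spelling of the five-letter
label "ecole" and then shows that case folding collapses them into one
string. The helper name case_variants is hypothetical, and Python's
str.casefold is used here only as a stand-in for case folding in
general; it is not the IDNA 2003 mapping step itself.

```python
from itertools import product


def case_variants(label: str) -> list[str]:
    """Return every upper/lower-case spelling of `label` (hypothetical helper)."""
    # For each character, collect its distinct lower- and upper-case forms.
    choices = [sorted({ch.lower(), ch.upper()}) for ch in label]
    # The Cartesian product of those choices yields all spellings.
    return ["".join(combo) for combo in product(*choices)]


variants = case_variants("ecole")

# Without case folding, every spelling would be a separate label.
distinct_without_folding = len(set(variants))

# With case folding (str.casefold as an illustrative stand-in),
# all spellings collapse to a single label.
distinct_with_folding = len({v.casefold() for v in variants})

print(distinct_without_folding, distinct_with_folding)  # prints "32 1"
```

For a label with n cased letters the first number grows as 2^n, which
is the bundling cost described in the quoted text; accent variants
would multiply it further.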

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp      mailto:duerst at it.aoyama.ac.jp    


