[Idna-arabicscript] Re: Punycode Mixed-case annotation

JFC Morfin jefsey at jefsey.com
Fri Jul 3 03:51:35 CEST 2009


Dear all,
Will I answer Mark? No.

My priority is now with the Interplus Draft as a working document, so 
I would lack the time. But mainly it is because I could not.
What Mark talks about is not nonsense. However, (1) everyone knows it, 
and (2) it is by nature a different topic.
I am a multilinguist (simultaneous technical support of a diversity 
of languages), not a linguist (discussion of individual language cases).

Mark's interest is in globalization, i.e. the system he invented to 
support foreign languages in English technologies. As such it is a 
great success. It has the technical pros and cons of being an 
English-language, machine-centric system.

My interest is in the presentation and extended services network layers. 
This is what we need in order to empower:
- every language, without constraints (including English, French, 
Arabic, Chinese, etc.),
- as part of semiotic (including scripts) and mediatic end-to-end 
functions (access, addressing, delivery, routing, etc.),
- in every technology (including the Internet),
- using every coding (including Unicode),
- for every application (including IDNA),
- in every environment, from brain cells to outer space, each 
identified by its namespace.
This is a people-centric system, meant to accommodate the relations 
among the diversity of people, in what they are and what they do. Is 
that a reason to disregard Unicode? No! As far as the Internet is 
concerned, Unicode is a structured set of code points. If IDNA users 
are happy with it, the Internet should be happy with it. If they are 
not, it should not be.

Does that mean that Unicode should be disregarded? No!

Likewise, Unicode is not a reason for disregarding the DNS (this is 
the justification of IDNA). Each addresses the needs of a different 
layer. Layers need to cooperate in order for all of them to progress 
in symbiosis.

NB. What Mark discusses is not obsolete; it is just (as he documents 
very well himself) a layer below what I discuss.
Best.
jfc



At 01:47 03/07/2009, Mark Davis wrote:
>What Jefsey suggests is all nonsense.
>
>Here is what is happening. Basically, one can make a distinction 
>between capital letters that are required linguistically (the L at 
>the start of the sentence below and the M at the start of the proper 
>name Marcel in the example below) and those that just happen to have 
>a capital form (because the sentence is set in all caps). The former 
>in French are called majuscules, and the latter capitales. From Wikipedia:
>
>The sentence « LONGTEMPS MARCEL S’EST COUCHÉ DE BONNE HEURE » 
>is written in capitales, but only the first and the tenth 
>letters are majuscules. This is easier to see if the sentence 
>is written in small capitals: « Longtemps Marcel s’est 
>couché de bonne heure ».
>
>However, that distinction is not captured in Unicode, nor in ASCII, 
>nor in any other character encodings that I know of, nor should it 
>be. There are many distinctions in the usage of characters that are 
>not, and should not be, represented in the encoding. One could just 
>as well argue that the distinction between the pronunciation of "o" 
>in "rove", "move", and "love" needs to be in the encoding, or that 
>the difference between the "." in "1.2", "etc.", or "." at the end 
>of a sentence needs to be in the encoding. That would end up with 
>scads of identical characters that people would not distinguish when 
>keying, could not distinguish in display, are not in any existing 
>data, could not be depended on in processing, but would be just a 
>marvelous opportunity for spoofing!
>Nor, of course, should anyone think of trying to capture this 
>distinction in IDNA.
>
>Mark
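
The point in the quoted message, that the majuscule/capitale distinction is not represented in the encoding, can be checked with a minimal Python sketch. This is an illustration only and not part of the original exchange; the names ALL_CAPS, MIXED and describe are hypothetical, and the strings are simply the Wikipedia example sentence. It compares the "M" inside LONGTEMPS (a capitale only) with the "M" that starts MARCEL (a majuscule).

```python
import unicodedata

# The example sentence, once in all capitals and once in ordinary mixed case.
# NFC normalization is applied so both strings use precomposed accented letters.
ALL_CAPS = unicodedata.normalize(
    "NFC", "LONGTEMPS MARCEL S’EST COUCHÉ DE BONNE HEURE")
MIXED = unicodedata.normalize(
    "NFC", "Longtemps Marcel s’est couché de bonne heure")


def describe(ch):
    """Return the code point and Unicode character name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


capitale_m = ALL_CAPS[6]    # the M inside LONGTEMPS (capital only by typesetting)
majuscule_m = ALL_CAPS[10]  # the M starting MARCEL (capital required linguistically)

print(describe(capitale_m))
print(describe(majuscule_m))
print(capitale_m == majuscule_m)   # same encoded character in both roles
print(MIXED.upper() == ALL_CAPS)   # upper-casing discards the distinction
```

In the mixed-case string the two positions differ ("m" versus "M"), so the information exists only as a choice of case in the text, not as a separate kind of capital letter in the character set.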