New version, draft-faltstrom-idnabis-tables-02.txt, available

Fri Jun 15 10:36:26 CEST 2007

At 03:00 07/06/15, Patrik Faltstrom wrote:
>
>On 14 jun 2007, at 10.37, Gervase Markham wrote:

>> There seem to me to be an extremely large number of characters in  
>> MAYBE YES and MAYBE NO, which corresponds to a great deal of  
>> uncertainty. I agree with Mark that this seems highly undesirable.
>>
>> Is it anticipated that more character sets will move into Stable/
>> Favored status before the release, thereby reducing this  
>> uncertainty? Or will it be reduced some other way?
>
>Please suggest more codepoints that you Unicode people know will  
>*NEVER* (and I really, really mean that) ever change.
>
>Of course I as editor would like to see as many codepoints as  
>possible here, but...we all know "bugs" in the Unicode tables have  
>been found.
>
>I don't like the current situation either. But I am personally not  
>the one who would bet my house on more scripts than what is now suggested.

I think we all agree that we want to be on the safe side. Also,
everybody knows that bugs and other problems have been found in Unicode.

But I think it's just impossible to say that no bugs or issues will
ever be found in Unicode in the future, just as it's impossible to
say that no bugs or issues will be found in, e.g., a specific Internet
protocol. One of the most important Areas in this respect within
the IETF, the Security Area, lives with the constant threat of a
new mathematical breakthrough that could make a protocol or a protocol
option obsolete in an instant, and with the continuous eroding of
security margins due to performance improvements. In all other
areas, issues and bugs in specs are found and addressed on a
continuous basis. Many of these are harmless, but occasionally one is not.

So what we have to do is to find a way to assess the risk, and
live with the risk. This may, as a last resort, include the retraction
of domain names (in the same way as, or similarly to how, the paypal
issue was handled) if there is a serious potential for confusion that
is being exploited (rather than two legitimate separate things,
with pointers to each other of the type "see over there for that
other one"). This is of course outside of the IETF. We have to be
aware of the fact that however much we try, we will never be perfect.
Even if Unicode were perfect, there is always the possibility
that we introduce errors.

As already said, starting with Latin/Greek/Cyrillic inside the
STABLE (or whatever) category and all the rest outside of it isn't
really appropriate. As an example, the current draft includes
all IPA characters
(024F..02AF ; ALWAYS    # LATIN SMALL LETTER Y WITH STROKE..LATIN SMALL L).
Some of these, in particular U+0251 (LATIN SMALL LETTER ALPHA) and
U+0261 (LATIN SMALL LETTER SCRIPT G), are indistinguishable from a
lowercase 'a' or 'g' in an italic font.
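
To make the point checkable, here is a minimal sketch, assuming Python 3
and its standard unicodedata module. It parses the table line quoted above
and confirms that the two lookalike code points fall inside the ALWAYS
range. The helper names (parse_range_line, in_range, describe) and the
LOOKALIKES list are made up for illustration and are not part of the draft.

```python
import unicodedata

# The table line as quoted in this message (its trailing comment is truncated there).
TABLE_LINE = "024F..02AF ; ALWAYS    # LATIN SMALL LETTER Y WITH STROKE..LATIN SMALL L"


def parse_range_line(line):
    """Split 'XXXX..YYYY ; CATEGORY # comment' into (first, last, category)."""
    data = line.split("#", 1)[0]              # drop the trailing comment
    span, category = (part.strip() for part in data.split(";", 1))
    first, _, last = span.partition("..")
    last = last or first                      # tolerate a single code point
    return int(first, 16), int(last, 16), category


FIRST, LAST, CATEGORY = parse_range_line(TABLE_LINE)

# Illustrative pairs only: (code point from the range, ASCII letter it resembles).
LOOKALIKES = [(0x0251, "a"), (0x0261, "g")]


def in_range(cp):
    """True if the code point is covered by the parsed table line."""
    return FIRST <= cp <= LAST


def describe(cp):
    """Format a code point as 'U+XXXX NAME' using the local Unicode database."""
    return "U+%04X %s" % (cp, unicodedata.name(chr(cp), "<unnamed>"))


for cp, ascii_char in LOOKALIKES:
    status = CATEGORY if in_range(cp) else "not in this range"
    print("%s (%s): resembles '%s' (%s)"
          % (describe(cp), status, ascii_char, describe(ord(ascii_char))))
```

Note that this only shows which category the table assigns; whether two
glyphs are actually confusable depends on the font and cannot be derived
from the code points alone.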

Regards,    Martin.

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst at it.aoyama.ac.jp