New version, draft-faltstrom-idnabis-tables-02.txt, available

Wed Jun 20 10:51:41 CEST 2007

At 10:27 +0200 2007-06-20, Patrik Fältström wrote:

>>I don't think you can get away with updating 
>>without human intervention, discussion, and 
>>decision. The writing systems of the world are 
>>not tidy.
>>
>>If you take this notion on board and embrace 
>>it, I think you will be more comfortable about 
>>updating to future versions of Unicode.
>
>The Unicode Consortium already has a process 
>today, when adding codepoints, for deciding on 
>the values of the properties that exist today.

Yes, they do.

>I see a big difference between:
>
>  - Having the IETF use those properties and 
>calculate what codepoints can be used in IDN

Um, that would be a bad idea. The IETF must work 
together with the Unicode Consortium to do this 
work. There must be cooperation... now and in 
future... between the two organizations.

>  - Having IETF ask Unicode Consortium to define a new property,
>    and learn how to evaluate for every codepoint added what property
>    value it should have

I don't see why IETF would be asking for new 
property definitions. What properties do you have 
in mind?

Again, the IETF has to build into its IDN process 
a healthy liaison with the UTC. Script and 
character expertise is on the side of those who 
develop the UCS. That's where you can ask 
questions and get clarification. I doubt that 
IETF has the competence to evaluate what property 
values a character "should have". That's not a 
problem, so long as there is a good liaison 
process.

>So, starting to have rules for individual 
>codepoints will be a completely different kind 
>of thing than an algorithm based on existing 
>properties, and that is one of the reasons IDNA 
>is locked today to Unicode 3.2.

Patrik.

Of course you have no particular reason to listen 
to me. I am only a linguist and expert in the 
writing systems of the world, who am mostly 
concerned with adding new scripts and characters 
to the UCS. But I have said for a long time now 
that YOU CANNOT DO THIS WORK ALGORITHMICALLY. Six 
months? A year? I don't know how long. You're 
going to have to grasp this nettle. The writing 
systems of the world can't be reduced to an 
algorithm. There will ALWAYS be some exception. 
That is why Ken and Mark and Michel have worked 
on a table for use. You can't generate that table 
from properties. You have to choose the table 
based on intelligent analysis.

Perhaps IDNA is locked on Unicode 3.2 because the 
IETF is uncomfortable with human-selected tables. 
Well, that's IETF's problem, and it is easily 
solved. Remove the pre-condition which you have 
set, that "an algorithm based on existing 
properties" is the way in which the table is 
"generated". Then you will be able to make 
progress. But it's *your* pre-condition.

>I.e. I am extremely nervous that the implication 
>will be that IDNA200x will be locked to Unicode 
>5.0 because of this (or 5.1 or whatever fixed 
>version), just like the IDNA we have today.

Write into the rules a joint IETF/UTC process for 
evaluating and agreeing updates as the character 
set grows in time.

>I was hoping the existing properties would be 
>enough for doing IDNA200x, and I still have not 
>given up. If the Unicode people tell me that is 
>not possible, then things change quite 
>drastically.

It is not dramatic. It is simple. You (IETF/UTC) 
agree a table which has a certain content. When 
Unicode 6.0 comes out, you (IETF/UTC) sit 
together and agree a revised table.

>Yes, I have heard you personally (and others) 
>say that one has to have exceptions here and 
>there, but I have still been hoping we do not 
>have to have that.

At what point will you give up this hope? You 
will have to do so, I believe. In fact I think 
(if you will forgive me for saying so) that *your 
hope* is the primary stumbling block which has 
prevented this IETF/UTC group from making 
progress. If *you* give up this "hope" we should 
be able to make progress.

That is my opinion. Perhaps others share it. I am 
independent enough to say it out loud. I am sorry 
if my opinion finds disfavour with you.

>We have sort of had this discussion before, but 
>never really dived into the question of "what 
>happens if we do NOT have the exceptions, how 
>bad is that" and compared it with the implications 
>of doing inspection at the codepoint level. Is it 
>worth it (regardless of what path we choose)?

We are here now. We can look at the list of 
characters in the table and inspect them. Did you 
think an algorithm was going to know about 
writing systems? PLEASE jog yourself out of your 
current abstractions. Writing systems are untidy. 
The table has to be selected by PEOPLE. By this 
group of people.

>This is definitely the time when we should have the discussion.

OK. My take on this hasn't changed for six months 
or more. But if the discussion is now, then it is 
now.
-- 
Michael Everson * http://www.evertype.com