New version, draft-faltstrom-idnabis-tables-02.txt, available
Michael Everson
everson at evertype.com
Wed Jun 20 10:51:41 CEST 2007
At 10:27 +0200 2007-06-20, Patrik Fältström wrote:
>>I don't think you can get away with updating
>>without human intervention, discussion, and
>>decision. The writing systems of the world are
>>not tidy.
>>
>>If you take this notion on board and embrace
>>it, I think you will be more comfortable about
>>updating to future versions of Unicode.
>
>The Unicode Consortium already has a process
>today for deciding, when codepoints are added,
>on the values of the properties that exist today.
Yes, they do.
>I see a big difference between:
>
> - Having the IETF use those properties and
>calculate what codepoints can be used in IDN
Um, that would be a bad idea. The IETF must work
together with the Unicode Consortium to do this
work. There must be cooperation... now and in
future... between the two organizations.
> - Having IETF ask Unicode Consortium to define a new property,
> and learn how to evaluate for every codepoint added what property
> value it should have
I don't see why IETF would be asking for new
property definitions. What properties do you have
in mind?
Again, the IETF has to build into its IDN process
a healthy liaison with the UTC. Script and
character expertise is on the side of those who
develop the UCS. That's where you can ask
questions and get clarification. I doubt that
IETF has the competence to evaluate what property
values a character "should have". That's not a
problem, so long as there is a good liaison
process.
>So, starting to have rules for individual
>codepoints will be a completely different kind
>of thing than an algorithm based on existing
>properties, and one of the reasons IDNA is
>locked today to Unicode 3.2.
Patrik.
Of course you have no particular reason to listen
to me. I am only a linguist and expert in the
writing systems of the world, who am mostly
concerned with adding new scripts and characters
to the UCS. But I have said for a long time now
that YOU CANNOT DO THIS WORK ALGORITHMICALLY. Six
months? A year? I don't know how long. You're
going to have to grasp this nettle. The writing
systems of the world can't be reduced to an
algorithm. There will ALWAYS be some exception.
That is why Ken and Mark and Michel have worked
on a table for use. You can't generate that table
from properties. You have to choose the table
based on intelligent analysis.
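
To make the two positions concrete, here is a minimal sketch in Python of a table derived from an existing Unicode property and then corrected by a human-reviewed exception list. Everything in it is an assumption for illustration: the names `ALLOWED_CATEGORIES`, `EXCEPTIONS`, `allowed_by_property` and `allowed` are hypothetical, the category rule is not the rule from any draft, and the two exception entries are examples only, not entries from the table Ken, Mark and Michel have worked on.

```python
import unicodedata

# Hypothetical property-based rule: letters, marks and decimal digits
# are candidates. This is an illustrative assumption, not a draft rule.
ALLOWED_CATEGORIES = {"Ll", "Lo", "Lm", "Mn", "Mc", "Nd"}

# Hypothetical human-reviewed overrides (illustrative entries only).
# True forces a codepoint in, False forces it out.
EXCEPTIONS = {
    0x00B7: True,   # MIDDLE DOT: category Po, so the rule alone rejects it
    0x0640: False,  # ARABIC TATWEEL: category Lm, so the rule alone accepts it
}


def allowed_by_property(cp: int) -> bool:
    """Decide purely from the General_Category property."""
    return unicodedata.category(chr(cp)) in ALLOWED_CATEGORIES


def allowed(cp: int) -> bool:
    """Apply the reviewed exceptions first, then fall back to the property rule."""
    if cp in EXCEPTIONS:
        return EXCEPTIONS[cp]
    return allowed_by_property(cp)


if __name__ == "__main__":
    for cp in (0x0061, 0x00B7, 0x0640):
        print(f"U+{cp:04X}", allowed_by_property(cp), allowed(cp))
```

The property rule covers the bulk of the repertoire, while each entry in the exception list records a decision that people had to make. When a new Unicode version arrives, it is that list which would need joint review.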
Perhaps IDNA is locked on Unicode 3.2 because the
IETF is uncomfortable with human-selected tables.
Well, that's IETF's problem, and it is easily
solved. Remove the pre-condition which you have
set, that "an algorithm based on existing
properties" is the way in which the table is
"generated". Then you will be able to make
progress. But it's *your* pre-condition.
>I.e. I am extremely nervous that the implication
>will be that IDNA200x will be locked to Unicode 5.0
>because of this (or 5.1 or whatever fixed
>version), just like the IDNA we have today.
Write into the rules a joint IETF/UTC process for
evaluating and agreeing updates as the character
set grows in time.
>I was hoping the existing properties would be
>enough for doing IDNA200x, and I have still not
>given up. If the Unicode people tell me that is
>not possible, then things change quite
>drastically.
It is not dramatic. It is simple. You (IETF/UTC)
agree a table which has a certain content. When
Unicode 6.0 comes out, you (IETF/UTC) sit
together and agree a revised table.
>Yes, I have heard you personally (and others)
>say that one has to have exceptions here and
>there, but I have still been hoping we do not
>have to have that.
At what point will you give up this hope? You
will have to do so, I believe. In fact I think
(if you will forgive me for saying so) that *your
hope* is the primary stumbling block which has
prevented this IETF/UTC group from making
progress. If *you* give up this "hope" we should
be able to make progress.
That is my opinion. Perhaps others share it. I am
independent enough to say it out loud. I am sorry
if my opinion finds disfavour with you.
>We have sort of had this discussion before, but
>never really dived into the question of "what
>happens if we do NOT have the exceptions, how
>bad is that" and compared that with the implications
>of doing inspection at the codepoint level. Is it
>worth it (regardless of what path we choose)?
We are here now. We can look at the list of
characters in the table and inspect them. Did you
think an algorithm was going to know about
writing systems? PLEASE jog yourself out of your
current abstractions. Writing systems are untidy.
The table has to be selected by PEOPLE. By this
group of people.
>This is definitely the time when we should have the discussion.
OK. My take on this hasn't changed for six months
or more. But if the discussion is now, then it is
now.
--
Michael Everson * http://www.evertype.com