New version, draft-faltstrom-idnabis-tables-02.txt, available

JFC Morfin jefsey at
Wed Jun 20 16:50:44 CEST 2007

The problem we face seems to consist in having men and computers 
writing and communicating together along a common standard: there are 
constraints on the three sides involved.
- Patrick supports the computer,
- Michael the man side,
- and Debbie the standard side.
The need is for practical, stable and secure interintelligibility among them.

Question: where are the blocking factors we did not identify (had we 
identified them, we would have addressed them)? I suggest two:

- our conceptual confusion.

We deal with countries, scripts, and languages as if they were the 
same everywhere, based on the same attributes, with the same 
characteristics, wherever they are used. IMHO, helping to clarify 
this belongs to Debbie's standards (metadata) side, which she already 
has to address with ISO 639-4/-6 and her proposed TC46 NWIP.

- the lack of conceptual interoperability between man and computer.

We need an interreadability solution which will not change (so the 
network can constantly support it without being delayed by its 
ten-year hysteresis) but can extend (so Michael can support every new 
script and writing system he may discover). I think this is possible. 
However, the first problem is not with the IETF (which takes care of 
the pipe) but with Unicode/ISO (which take care of the UCS). The 
second problem will be with the IETF: to support a registry 
architecture for them (no updates, only dated additions). In a 
nutshell: the current UCS is for people. We need a UCS for 
people-and-machines that can span revisions.
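To make the "no updates, only dated additions" idea concrete, such a 
registry could be sketched roughly as an append-only table from which 
any past state can be reconstructed. This is only an illustration; 
the class, the status labels and the example codepoints are all 
invented here, not part of any actual proposal:

```python
from datetime import date

class DatedRegistry:
    """Append-only registry sketch: entries are never updated,
    only added with a date, so old views remain reconstructible."""

    def __init__(self):
        self._entries = []  # list of (date, codepoint, status) tuples

    def add(self, when, codepoint, status):
        # No update in place: a later entry for the same codepoint
        # supersedes earlier ones by date, but the old record survives
        # for implementations pinned to an earlier state.
        self._entries.append((when, codepoint, status))

    def status_as_of(self, codepoint, when):
        # Reconstruct the registry's view of one codepoint at any date.
        relevant = [e for e in self._entries
                    if e[1] == codepoint and e[0] <= when]
        return max(relevant)[2] if relevant else None

# Hypothetical example data (labels are illustrative only):
reg = DatedRegistry()
reg.add(date(2007, 1, 1), 0x00E9, "ALLOWED")      # e with acute
reg.add(date(2008, 6, 1), 0x0149, "DISALLOWED")   # later dated addition
print(reg.status_as_of(0x00E9, date(2007, 6, 1)))  # ALLOWED
```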

This is a challenge. Can Unicode/ISO come up with a UCS system Patrik 
can work with using registry-based algorithms? Until then, I am 
afraid all Patrik/IETF can do is match the current Unicode version.
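What "matching the current Unicode version" with a purely 
property-based rule might look like can be sketched with Python's 
standard unicodedata module. The rule below is deliberately 
oversimplified and is NOT the draft's actual algorithm; it only 
illustrates the style of derivation under discussion:

```python
import unicodedata

def idn_candidate(cp: int) -> bool:
    """Oversimplified property-based rule (illustration only):
    allow lowercase letters, other letters, marks and decimal
    digits; reject anything with a compatibility decomposition."""
    ch = chr(cp)
    if unicodedata.category(ch) not in ("Ll", "Lo", "Mn", "Mc", "Nd"):
        return False
    # Compatibility decompositions are tagged like "<compat> 0066 0069".
    if unicodedata.decomposition(ch).startswith("<"):
        return False
    return True

print(idn_candidate(ord("a")))   # True: plain Latin small letter
print(idn_candidate(ord("A")))   # False: uppercase is excluded
print(idn_candidate(0xFB01))     # False: fi ligature decomposes
```

Note that the result depends on the Unicode database the runtime 
ships (unicodedata.unidata_version), which is exactly the 
version-pinning problem discussed in this thread.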

To better explain what I mean, I introduced the example of the UCSSEC 
idea (one code for one secure glyph, standing for several characters 
in different scripts/languages) to extend the UCS (one code for one 
character, rendered by several general glyphs). This means a unique 
grapheme-oriented table, documented with language/script-oriented 
metadata. Michael can use the whole data/metadata spectrum, while 
Patrik uses algorithms on the data layer only. The Internet will only 
have to work out an adapted presentation-layer mechanism to rebuild 
the current Unicode set from the UCSSEC data+metadata. IDNA presently 
puts this at the application level; I would prefer to put it at the 
OPES level (on the inner edge) so that the updates can be carried by 
the ISPs.
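As a toy illustration of that rebuild step, assuming an entirely 
hypothetical UCSSEC table format (the codes, table layout and 
function names below are invented for this sketch): one secure code 
names a grapheme, and the script metadata selects the concrete UCS 
character at the presentation layer:

```python
# Hypothetical UCSSEC table: one "secure" code identifies one
# grapheme shape; per-script metadata maps it back to a concrete
# UCS character. Everything here is invented for illustration.
UCSSEC = {
    "SEC-0001": {                # grapheme that looks like "a"
        "Latin":    "a",         # U+0061 LATIN SMALL LETTER A
        "Cyrillic": "\u0430",    # U+0430 CYRILLIC SMALL LETTER A
    },
}

def to_ucs(sec_code: str, script: str) -> str:
    """Presentation-layer step: rebuild the current Unicode
    character from UCSSEC data plus its script metadata."""
    return UCSSEC[sec_code][script]

print(to_ucs("SEC-0001", "Latin"))     # a (U+0061)
print(to_ucs("SEC-0001", "Cyrillic"))  # U+0430
```

The security angle is that both lookups resolve the same secure code, 
so a comparison on the data layer cannot confuse the two look-alikes.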

This is just an example; there may be simpler and more stable solutions.

PS. I am glad to see that the points I was denied are progressively 
going through (e.g. an IETF/Unicode MoU would be of great help to 
formalize the different inter-relational aspects and responsibilities).

At 11:12 20/06/2007, Debbie Garside wrote:
>I see both sides of this and I think there could be a compromise.  I like
>Patrik's "rules" but I can see that they will not work without some human
>intervention.  Is there a way forward that will utilise the rules as a
>starting point to produce a base list which is then revised by
>UNICODE/script experts?
>For me, as Editor of ISO 639-6, I would like to see Unicode Codepoints
>allocated to the language writing system (alpha4) code within ISO 639-6 -
>that's why I put them there! I put this to the CLDR group last year.  A lot
>of work but it would be a beautiful result.  Subsets of the codepoints
>allocated to a writing system could be created for IDN purposes.
>Best regards
> > -----Original Message-----
> > From: idna-update-bounces at
> > [mailto:idna-update-bounces at] On Behalf Of
> > Michael Everson
> > Sent: 20 June 2007 09:52
> > To: Patrik Fältström
> > Cc: idna-update at
> > Subject: Re: New version,
> > draft-faltstrom-idnabis-tables-02.txt, available
> >
> > At 10:27 +0200 2007-06-20, Patrik Fältström wrote:
> >
> > >>I don't think you can get away with updating without human
> > >>intervention, discussion, and decision. The writing systems of the
> > >>world are not tidy.
> > >>
> > >>If you take this notion on board and embrace it, I think you
> > >>will be more comfortable about updating to future versions of
> > >>Unicode.
> > >
> > >The Unicode Consortium already has a process, when adding
> > >codepoints, for deciding on the property values that exist today.
> >
> > Yes, they do.
> >
> > >I see a big difference between:
> > >
> > >  - Having the IETF use those properties and calculate what
> > >    codepoints can be used in IDN
> >
> > Um, that would be a bad idea. The IETF must work together
> > with the Unicode Consortium to do this work. There must be
> > cooperation... now and in future... between the two organizations.
> >
> > >  - Having IETF ask Unicode Consortium to define a new property,
> > >    and learn how to evaluate for every codepoint added what property
> > >    value it should have
> >
> > I don't see why IETF would be asking for new property
> > definitions. What properties do you have in mind?
> >
> > Again, the IETF has to build into its IDN process a healthy
> > liaison with the UTC. Script and character expertise is on
> > the side of those who develop the UCS. That's where you can
> > ask questions and get clarification. I doubt that IETF has
> > the competence to evaluate what property values a character
> > "should have". That's not a problem, so long as there is a
> > good liaison process.
> >
> > >So, starting to have rules for individual codepoints will be a
> > >completely different kind of thing than an algorithm based on
> > >existing properties, and one of the reasons IDNA is locked today
> > >to Unicode 3.2.
> >
> > Patrik.
> >
> > Of course you have no particular reason to listen to me. I am
> > only a linguist and expert in the writing systems of the
> > world, who am mostly concerned with adding new scripts and
> > characters to the UCS. But I have said this for a long time now.
> > A year? I don't know how long. You're going to have to grasp
> > this nettle. The writing systems of the world can't be
> > reduced to an algorithm. There will ALWAYS be some exception.
> > That is why Ken and Mark and Michel have worked on a table
> > for use. You can't generate that table from properties. You
> > have to choose the table based on intelligent analysis.
> >
> > Perhaps IDNA is locked on Unicode 3.2 because the IETF is
> > uncomfortable with human-selected tables.
> > Well, that's IETF's problem, and it is easily solved. Remove
> > the pre-condition which you have set, that "an algorithm
> > based on existing properties" is the way in which the table
> > is "generated". Then you will be able to make progress. But
> > it's *your* pre-condition.
> >
> > >I.e. I am extremely nervous that the implication will be that
> > >IDNA200x will be locked to Unicode 5.0 because of this (or 5.1
> > >or whatever fixed version), just like the IDNA we have today.
> >
> > Write into the rules a joint IETF/UTC process for evaluating
> > and agreeing updates as the character set grows in time.
> >
> > >I was hoping the existing properties would be enough for doing
> > >IDNA200x, and I have still not given up. If the Unicode people
> > >tell me that is not possible, then things change quite drastically.
> >
> > It is not dramatic. It is simple. You (IETF/UTC) agree a
> > table which has a certain content. When Unicode 6.0 comes
> > out, you (IETF/UTC) sit together and agree a revised table.
> >
> > >Yes, I have heard you personally (and others) say that one has
> > >to have exceptions here and there, but I have still been hoping
> > >we do not have to have that.
> >
> > At what point will you give up this hope? You will have to do
> > so, I believe. In fact I think (if you will forgive me for
> > saying so) that *your
> > hope* is the primary stumbling block which has prevented this
> > IETF/UTC group from making progress. If *you* give up this
> > "hope" we should be able to make progress.
> >
> > That is my opinion. Perhaps others share it. I am independent
> > enough to say it out loud. I am sorry if my opinion finds
> > disfavour with you.
> >
> > >We have sort of had this discussion before, but never really
> > >dived into the question of "what happens if we do NOT have the
> > >exceptions, how bad is that" and compared it with the implications
> > >of doing inspection at the codepoint level. Is it worth it
> > >(regardless of what path we choose)?
> >
> > We are here now. We can look at the list of characters in the
> > table and inspect them. Did you think an algorithm was going
> > to know about writing systems? PLEASE jog yourself out of
> > your current abstractions. Writing systems are untidy.
> > The table has to be selected by PEOPLE. By this group of people.
> >
> > >This is definitely the time when we should have the discussion.
> >
> > OK. My take on this hasn't changed for six months or more.
> > But if the discussion is now, then it is now.
> > --
> > Michael Everson *
> > _______________________________________________
> > Idna-update mailing list
> > Idna-update at
> >
> >
> >
>Idna-update mailing list
>Idna-update at