Table-building

Kenneth Whistler kenw at sybase.com
Thu Feb 1 02:54:51 CET 2007


> On 2/1/07, Kenneth Whistler <kenw at sybase.com> wrote:
> >
> >
> > > --On 14. desember 2006 18:24 -0800 Kenneth Whistler <kenw at sybase.com>
> > wrote:
> > >
> > > > The *table* itself should unambiguously be defined as
> > > > the list of characters appropriate for inclusion in
> > > > IDNA. IDNAInclusion.txt (or whatever name you like).
> > >
> 
> 
> From the above, I get the feeling that you are proposing a static list of
> characters. If, however, you are suggesting that the list of characters is
> one that is included in UCD with every new version/errata of Unicode, and is
> derived from a set of properties, then I think it makes sense.

Yes, the latter. But because it would be tied to each major or
minor version of Unicode (the only points at which characters
can be added to that standard), it wouldn't change for the
more frequent update releases and wouldn't be affected by
any errata notices that accumulate in between.

When I talk about a "table", I'm talking about the logical
equivalent of one. The logical "inclusion table" required by
the algorithm for IDNA nameprep needs to be defined somehow.
I'm not proposing we print that table in an RFC, but instead
that it be based on the partition of Unicode characters
given by a character property defined in the UCD.

Using that property, you could, of course, print out an
actual table -- and for coding purposes, you could actually
implement it as a tabular data structure, but for
standardization purposes and the RFC specification, the
definition can be handled by reference.
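
As a rough illustration of the "tabular data structure" point, here
is a minimal Python sketch. It assumes the property would be published
in the usual UCD data-file layout ("code point or range ; property
name # comment"). The property name IDNPermitted is only the working
name from this thread, the three sample entries are invented for the
example and are not actual property values, and parse_ranges and
is_permitted are hypothetical helper names.

```python
import bisect

# Hypothetical sample in the usual UCD property-file layout.
# These entries are illustrative only, not real property values.
SAMPLE = """\
# IDNPermitted (working name); sample data only
002D          ; IDNPermitted # HYPHEN-MINUS
0030..0039    ; IDNPermitted # DIGIT ZERO..DIGIT NINE
0061..007A    ; IDNPermitted # LATIN SMALL LETTER A..LATIN SMALL LETTER Z
"""


def parse_ranges(text, prop="IDNPermitted"):
    """Return a sorted list of (first, last) code point ranges for prop."""
    ranges = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop comments
        if not data:
            continue
        fields = [f.strip() for f in data.split(";")]
        if len(fields) < 2 or fields[1] != prop:
            continue
        first, _, last = fields[0].partition("..")
        lo = int(first, 16)
        hi = int(last, 16) if last else lo  # single code point entry
        ranges.append((lo, hi))
    ranges.sort()
    return ranges


def is_permitted(ch, ranges):
    """Binary-search the sorted, non-overlapping ranges for ch."""
    cp = ord(ch)
    # Index of the last range whose first code point is <= cp.
    i = bisect.bisect_right(ranges, (cp, 0x110000)) - 1
    return i >= 0 and ranges[i][0] <= cp <= ranges[i][1]
```

With the sample data above, is_permitted("a", parse_ranges(SAMPLE))
is true and is_permitted("A", parse_ranges(SAMPLE)) is false. The
specification itself would still only reference the property; a table
like this is purely an implementation detail.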

> Is IDNPermitted.txt the new name for it?

Just a working name, for discussion by this group. The file could
be named anything that seems appropriate. I just picked
that name for now so it would be clear what it would be
intended for.


> > The statement of the IDNA nameprep, however it gets worked out
> > in detail, is going to need an inclusion table.
> 
> 
> But we aren't actually going to list the characters in the new spec. It'll
> just be a reference to something like:
> http://www.unicode.org/Public/UNIDATA/IDNPermitted.txt
> 
> Right?

Right.

--Ken

> 
> 
> > I just pushed up IDNPermitted.txt to demonstrate what the
> > documentation for such a binary character property could (and
> > probably would) look like, if published as part of the Unicode
> > Character Database. The property is then easy to refer to
> > and easy to implement.
> 
> 
> +1.
> 
> 
> =wil


