Comments on IDNAbis protocol-03

Harald Alvestrand harald at
Mon Jan 21 07:45:42 CET 2008

Mark Davis wrote:
> > > Protocol-5.  The "Contextual Rules" need to be supplied.
> > > (What is the format? Machine readable? Are there default
> > > required ones -- there should be, for ZWJ/ZWNJ).
> >
> > Yes they need to be supplied, and as quickly as possible.  The
> > list, and the rules themselves, are presumably a job for an
> > IANA registry, probably initialized by a piece of the "tables"
> > document (since that is how related things are being done).
> > Machine readable would certainly be good, but there has been no
> > in-depth discussion yet about how to do that.  In particular it
> > > is not clear whether all rules that are possible (i.e., that
> > may be required) can be appropriately expressed in regular
> > expression form using set elements of those expressions that
> > are well-defined and persistent.  But I don't understand what
> > you intend by "default required ones".  When contextual rules
> > are required, they are required and there is no default other
> > than "treat the corresponding code point as invalid".  Could
> > you explain?
> What I mean is that certain contextual rules, like no combining mark
> at the start, or restrictions on ZWJ/ZWNJ need to be always present.
> Others may be optional, depending on the registry.
In which case they're not part of the protocol (or the protocol's
tables) at all.
There's no restriction against the registries using the same mechanisms
as the IDNAbis spec to describe their particular rules, but the IDNAbis
spec needs to contain only the global rules.
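
To make the "global rules" idea concrete, here is a minimal sketch in
Python. It is an illustration only, not anything the drafts specify: the
function names, the rule table, and in particular the "preceded by a
virama" condition for ZWJ/ZWNJ are assumptions chosen for the example. It
shows a check for a leading combining mark, plus the default behaviour
John describes: a code point that requires a contextual rule and has none
is treated as invalid.

```python
import unicodedata

ZWNJ = "\u200c"
ZWJ = "\u200d"
MIDDLE_DOT = "\u00b7"


def starts_with_combining_mark(label: str) -> bool:
    """True if the first code point is a mark (general category M*)."""
    return bool(label) and unicodedata.category(label[0]).startswith("M")


def preceded_by_virama(label: str, i: int) -> bool:
    """Assumed example rule: the previous code point has combining class 9."""
    return i > 0 and unicodedata.combining(label[i - 1]) == 9


# Hypothetical registry of contextual rules, keyed by code point.
CONTEXT_RULES = {ZWJ: preceded_by_virama, ZWNJ: preceded_by_virama}

# Hypothetical set of code points that are only valid in context.
# MIDDLE_DOT deliberately has no rule, to show the default behaviour.
CONTEXT_REQUIRED = {ZWJ, ZWNJ, MIDDLE_DOT}


def label_passes_global_rules(label: str) -> bool:
    if not label or starts_with_combining_mark(label):
        return False
    for i, ch in enumerate(label):
        if ch in CONTEXT_REQUIRED:
            rule = CONTEXT_RULES.get(ch)
            # No rule supplied: treat the code point as invalid.
            if rule is None or not rule(label, i):
                return False
    return True
```

Anything a particular registry wants beyond this could reuse the same
table shape without being part of the protocol.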
> However, we should also point out that registries may have rules that
> require access to more than just a label, such as:
>     * use folding table A to map a proposed registration to a
>       canonical form (e.g., Simplified Chinese form).
>     * if any already registered label has that form as well, reject
>       the registration
> That would allow both traditional and simplified characters, but the
> first one registered takes precedence. I mention this as an example;
> there are many other possibilities.
Have you read the JET documents and tables (RFC 3743 and the
ICANN-registered tables)?

That is an existing mechanism, so we already have precedent for rules like
that existing without being mentioned in the base spec.
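
For illustration, the registry-side check Mark describes could be
sketched roughly as below. This is a toy, not the RFC 3743 procedure: the
two-entry folding table, the `Registry` class, and its method names are
all hypothetical, and a real table would be far larger and maintained by
the registry.

```python
# Hypothetical "folding table A": Traditional -> Simplified, two entries only.
FOLDING_TABLE_A = str.maketrans({"國": "国", "學": "学"})


class Registry:
    """Toy registry where the first label registered for a folded form wins."""

    def __init__(self) -> None:
        # Maps canonical (folded) form -> label as originally registered.
        self._by_canonical: dict[str, str] = {}

    def canonical_form(self, label: str) -> str:
        """Fold a proposed label to its canonical form."""
        return label.translate(FOLDING_TABLE_A)

    def register(self, label: str) -> bool:
        """Register the label unless its canonical form is already taken."""
        key = self.canonical_form(label)
        if key in self._by_canonical:
            return False
        self._by_canonical[key] = label
        return True
```

Note that the check needs the registry's whole set of registered labels,
not just the label being looked up, which is why it sits in registry
policy rather than in the protocol.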

