New version: draft-ietf-idna-tables-01.txt

JFC Morfin jefsey at
Wed May 7 17:29:47 CEST 2008


On 00:36 07/05/2008, John C Klensin said:
>--On Monday, 05 May, 2008 18:16 -0400 Vint Cerf
><vint at> wrote:
> > I do not believe we had consensus on the historic scripts -
> > just a discussion.
> >
> > There seem to be more than ample ways to advertise the
> > existence of texts using these scripts without the need to
> > instantiate the scripts in DNS.
>Let me suggest a different theory:
>(1) The letters and digits of the historic scripts are not, in
>any way, less letters and digits than those of scripts that are
>more actively used.  There doesn't seem to be any disagreement
>about that.
>(2) As a group, the characters of the historic scripts are no
>more likely to cause serious confusion or descriptive problems
>than the letters and digits of more actively used scripts.  I
>don't believe there is any disagreement about that either.
>(3) Scripts and languages are classified as "historical" or
>"archaic" using criteria for which there is little consensus in
>the larger community (e.g., I suspect there are differences of
>opinion between parts of the linguistic community and parts of
>the cultural preservation one).  If one classifies on the basis
>of number of living primary-language speakers, one gets one
>list.  If one does so on the basis of a count of
>primary-language native speakers within some recent period of
>time, one gets different lists... and arguments about what
>period of time should be used.  If one adds "who are also
>literate in the written form of the language", then one gets yet
>other lists.  If one evaluates IDN-appropriateness on the basis
>of how many people use the script on a daily basis today (with
>"use" being reading and/or writing), then several of those
>archaic scripts have significantly more users than some
>contemporary ones.
>Worse for our purposes, some scripts that were clearly of only
>historical interest a decade or two ago are being resurrected
>and taught in schools.  They are probably still a curiosity
>today, but some would predict that they would become significant
>enough in another decade or so to require reclassifying them
>(remembering that reclassification from DISALLOWED to
>Protocol-Valid is going to be more or less a big deal that
>should be avoided if possible).
>I also don't see making an exclusion of "archaic scripts in
>Plane 1".  While I don't personally expect any of the scripts
>that are there now to be used in many IDNs, I'm also looking
>toward the future.  In that future, I don't see room in the BMP
>for even one script with more than a few handfuls of characters
>in it (if I interpret the Unicode 5.1 tables correctly, there is
>only one block of about 260 characters left, probably only 255
>after allowances for block integrity).  Even one large-ish script
>and there will be no choice but to use Plane 1 space.
>To me, what this adds up to is that...
>         (i) A restriction on historic or archaic scripts will
>         require us to make another rule that we don't otherwise
>         need, a rule that is based on blocks or enumerated
>         script names, not the properties we are otherwise using.
>         Keeping things as simple as possible argues that we
>         should have as many rules as we need, but no more.  And
>         I don't think we need this one.
>         (ii) A restriction on historic or archaic scripts is
>         likely to embroil us in arguments with scholarly,
>         research, and cultural preservation and reconstruction
>         communities that we don't really need to have unless
>         there are substantive benefits to be gained from
>         excluding these characters at the protocol level.  And
>         there are no such benefits.
>         (iii) Imposing this restriction and disallowing these
>         scripts and the characters they contain raises the odds
>         of ever having to move a significant number of
>         characters from DISALLOWED to PVALID.  It is very much
>         in our interest to keep the number of those cases, and
>         the odds of finding them, as low as possible, whether
>         one adopts a more restrictive or more liberal view of
>         what it takes to make the move.
>Now, in my mental list of "advice I would give zone
>administrators who were interested in my advice", the very first
>one on the list is "don't register labels that contain
>characters from any script you don't understand".  An obvious
>corollary to that would be registry restrictions banning the use
>of any of these scripts unless the zone actively serviced people
>doing work in/with specific ones of them.  I would expect that
>the number of such zones would be very small.  But I don't think
>the case has been made for banning these letters and numbers.
>      john