Archaic scripts -- the Battle of Examples

John C Klensin klensin at jck.com
Mon May 12 17:25:07 CEST 2008



--On Monday, 12 May, 2008 09:21 -0400 Andrew Sullivan
<ajs at commandprompt.com> wrote:

> Dear colleagues,
> 
> On Sat, May 10, 2008 at 03:49:50PM -0400, John C Klensin wrote:
> 
>> I'm clearly a member of the "big deal" camp. 
> 
> I have been convinced by the arguments that moving things from
> DISALLOWED is a big deal.  So count me in that camp too.
> 
>> At the moment (subject to refinement and more understanding
>> and persuasion) my criterion for the gray area would put a
>> script there if it met either of the following criteria:
> 
> [. . .]
> 
> I am nevertheless uncomfortable with John's criteria for "grey
> area inclusion".  I appreciate John's reasoning for them, and
> if I had to pick some criteria, these would probably be them.
> But I am not convinced that this working group, or indeed the
> IETF in general, really has the broad participation of
> relevant anthropological and linguistic experts to make this
> kind of judgement.  So I don't think we should make it.

In case I haven't been clear, I agree.  I don't like those
criteria either.  But, unless we have such criteria and can make
them work, I think we are stuck with either 

	* making Archaic scripts PVALID
	
	* making some set of characters from Archaic scripts
	DISALLOWED, but only on a case by case basis

or
	
	* making some set of Archaic scripts DISALLOWED, but
	only on a case by case basis (i.e., not simply on the
	basis that some existing Unicode table says that the
	script is archaic).

I'm uncomfortable making the leap from "unsuitable for use in
programming language-type identifiers" or "not used to write any
contemporary language in recent memory" to "unsuitable for IDNs".

> What I like about the overall "internationalize LDH" approach
> is that it is conceptually simple.  It gives us some pretty
> clear guidance on the cases that are problematic.  It allows
> us to use a set of properties of scripts from some standard
> that does have the involvement of the anthropological and
> linguistic experts needed to make the kinds of judgement in
> question (even if everyone doesn't always agree with the
> results).  We know where the "land mines" are in the DNS, so
> we can address those cases specifically, and just derive
> everything else.

I think that position leads to "these are letters, therefore
they are PVALID".  If so, I agree.  And, again, I think it is
wise to suggest to zone administrators that these scripts
(based on the Unicode list or a broader one) should be avoided
unless there are strong reasons for permitting them.  Or, put
more broadly, that zone administrators should be strongly
encouraged to avoid registering characters from scripts of which
they do not have a good understanding.
 
> If we begin to deviate from this mostly-derived path, then we
> set ourselves up as somehow knowing something about what
> "should" be allowed in the DNS, on grounds of utility ("these
> historic scripts are useless, so they should be left out";
> "this particular historic script -- or code point -- even
> though categorized like the others, is different and needs to
> be allowed in").  Since different people will have different
> views on this utility, it opens us to endless discussions
> about what is in and out.

As we have already demonstrated :-(
 
> I've already argued for my very strong opinion that we should
> stick to the smallest necessary set of principles for deriving
> PVALID and DISALLOWED, erring on the side of PVALID if we
> can't be sure.  But if we're going to disallow some additional
> class of characters, let us not do it on a case by case basis. 

Concur.

    john



