Archaic scripts (was: Re: New version:draft-ietf-idna-tables-01.txt)

John C Klensin
Thu May 8 11:00:38 CEST 2008

--On Wednesday, 07 May, 2008 17:05 -0700 Kenneth Whistler
wrote:

> Vint,
> If you're in the midst of trying to declare consensus on this,
> I'll chime in quickly to indicate that I also totally disagree.
> Michel has the right of this, and I'm not finding the arguments
> at all cogent or convincing for redefining all the characters
> for archaic scripts as PVALID instead of DISALLOWED.
> By the way, this has nothing whatsoever to do with permitting
> archaic *languages*..... this is all about including or
> not including characters for archaic *scripts*.
> Archaic (and dead) languages can usually be written just
> fine with existing modern scripts -- examples: Classical Latin
> (language) with the Latin script, Attic Greek (language)
> with the Greek script, Middle Chinese (language) with
> the Han script, and so on...
> And so people are tending to mix up issues of language
> revival with issues of writing system revival. The two
> are completely distinct.

I agree that the mix-up occurs.   But it isn't our mix-up and
some of the people who are arguing for availability of the
original scripts are fairly passionate about it.   In
particular, one set of examples you did not use involves African
and pre-Columbian American languages.

> There are good, solid reasons why *all* of the cuneiform
> writing systems are long dead and will *stay* dead. They
> are writing systems for vanished technology: using styluses
> to imprint wedges in handheld clay tablets. They were replaced,
> thousands of years ago by modern technology, like ink and
> papyrus! Nobody is going to revive cuneiform writing --
> and the purpose of cuneiform in Unicode is to enable
> computerization of cuneiform corpuses in a text form, not
> to assist in cuneiform revivalist movements.
> IMO, inclusion of cuneiform (and many other long-dead,
> ancient scripts for archaic writing systems) in IDNs is
> just silly.

We are having a battle of examples.  You (and Michel) are
picking examples that no one expects to see in IDNs and that, on
a script by script basis, no one cares about seeing in IDNs.
Cuneiform scripts or Linear-B (or, for that matter, Linear-A)
are clearly in that category.  Maybe Runic --also probably more
suitable for incisions into wood or stone than pens, brushes,
and computer typesetting-- falls into that category too, or
maybe it doesn't (see Cary's note of some weeks ago).  The harder
cases are writing systems that, at least in the minds of their
contemporary advocates and would-be restorers, were perfectly fine
until forcibly replaced by those of conquerors (whether military
or religious) in relatively more recent historical times (8th -
10th century forward in Europe, Africa, and the Western
Hemisphere, but not "thousands of years ago").

One of the differences (possibly a useful one, possibly not), is
that, for some of these scripts, we can identify modern
languages that, with great confidence, are very closely related
to the languages written in those classic/historical scripts.
Whatever the scripts may be, the languages are certainly not
"extinct".  For Linear-B and at least several of the Cuneiform
scripts, we've got good guesses at the language family, but,
however effectively we have been able to decode the scripts (or
not), no one has heard the original language spoken for
thousands of years and knowledge of what it sounds (or sounded)
like is, in most cases, a matter of educated guesses.

To take a particularly challenging example, Classic Mayan is now
being taught to non-specialists and taught in primary and
secondary schools, not just advanced post-secondary courses.
That makes it very different, at least IMO, from scholars
writing Sumero-Akkadian.  It would have been impossible to teach
it that way thirty or forty years ago because the understanding
of the writing system wasn't there.  It is, I believe, being
taught more as a curiosity and artistic exercise than as a
primary script, but there are people who believe it should and
will come back as the primary script for the relevant languages.
Perhaps they are deluded.  But I'm not quibbling; I just do not
know where to draw the line that seems so clear to you.

I don't expect to see Classic Mayan used in everyday
correspondence, just because of its intricacy, but, as you know,
several serious people made the claim within the last century or
two that Han was obsolete and an impediment and needed to be
replaced for the same reason.  Some of these issues are in the
eye of the beholder.  I don't know how those African writing
systems that are the subjects of study, and of restoration
attempts meant to push out colonial influences, will fare, but
their advocates
have not at all confused preservation of the writing system with
preservation of the language in the way that you suggest.   And,
while almost all of those languages can be transliterated into
and, in your words, "written just fine with existing modern
scripts", that isn't the point, any more than the ability to
write Chinese in pinyin implies that we don't need Han in IDNs.
We need Han/CJK in IDNs because there are significant
populations that use it, not because the language could not be
written in any other way.

Some of those recently-replaced (or recently-suppressed,
depending on your point of view) systems probably are, as Michel
suggests, at least largely pictographic-illustrative.   Some are
not.  And, in many cases, the scripts have been identified as
pictographic and then reclassified as mostly or entirely
alphabetic or syllabic as understanding has increased.

If we assume that there are some historical/archaic scripts that
are irrelevant to IDNs and certain to remain so (and we do agree
on that), my problem is that I'm trying to find a reasonable
differentiating rule that separates the ones that may be
relevant from the ones that clearly are not. If Classic Mayan
ever ends up in Unicode (I predict that it will do so in the
next decade or two if current trends continue despite a whole
series of interesting problems), then any discrimination based
on Plane 0 / Plane 1 distinctions will fail because it won't fit
in the remaining space in Plane 0.   Again, I don't know about
the African examples or others (and perhaps all of them are
purely pictographic -- although I'd be surprised), but I suspect
that some of them will end up being used  (and used in more than
illustrations in historical articles) on the Internet as well.
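
To make the Plane 0 / Plane 1 point concrete, here is a minimal
sketch in Python.  The names plane(), plane_rule_allows(), and
SAMPLES are made up for illustration and do not come from the
tables draft; the code points are the first code points of the
corresponding Unicode blocks.  It assumes a rule of the form
"allow only Plane 0", which is only one way such a discrimination
might be written.

```python
# Hypothetical illustration only; not taken from any IDNA draft.

def plane(cp: int) -> int:
    """Return the Unicode plane number (0-16) of a code point."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {cp:#x}")
    return cp >> 16  # each plane holds 0x10000 code points


# First code point of each script's main Unicode block.
SAMPLES = {
    "Runic": 0x16A0,
    "Linear B": 0x10000,
    "Deseret": 0x10400,
    "Cuneiform": 0x12000,
}


def plane_rule_allows(cp: int) -> bool:
    """Assumed rule: allow a code point only if it sits in Plane 0."""
    return plane(cp) == 0


if __name__ == "__main__":
    for name, cp in SAMPLES.items():
        verdict = "allowed" if plane_rule_allows(cp) else "excluded"
        print(f"{name}: U+{cp:04X} plane {plane(cp)} -> {verdict}")
```

Under that assumed rule Runic is admitted while Linear B, Deseret,
and Cuneiform are all excluded, which reflects where each script
happened to be encoded rather than any judgment about relevance.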

I may be misunderstanding what you and Michel are trying to say,
but it feels as if, rather than trying to engage on separating
historic-and-irrelevant scripts from
historic-but-possibly-relevant ones, or accepting the group and
relying on registry restrictions, we are hearing an argument
that sounds suspiciously like:

	(i) We have identified this set of scripts as historic.
	(ii) Some historic scripts clearly don't belong in IDNs.
	(iii) Therefore all historic scripts should be banned
	from IDNs.

In addition to the obvious logic flaw, I continue to believe
that there are edge cases in Unicode's classifications of
scripts as "historic" or "not historic" that would be disputed
by other reasonable and competent people.

I tried with my comments about "uncertain" to see if we could
find a defining principle that would permit us to identify and
handle the relevant scripts and leave the others aside.
Probably it doesn't work.   But I have a lot of problems with
"not cuneiform, therefore not Runic" or "not Linear-B, therefore
not Deseret".


