Historic scripts as MAYBE?
John C Klensin
klensin at jck.com
Mon Apr 28 06:07:14 CEST 2008
--On Sunday, 27 April, 2008 12:08 -0700 Paul Hoffman
<phoffman at imc.org> wrote:
> At 3:56 PM +0900 2/1/08, Martin Duerst wrote:
>> I have to say that I'm very far from sold on the concept of
>> MAYBE, but could it make sense to have historic scripts such
>> as Runic as MAYBE?
> The WG should revisit this topic, even with "MAYBE" being
> dead. Do we really want historic scripts as allowed characters
> for IDNs?
> The Unicode Consortium has a list of archaic/historical
> scripts that are "no longer used to write living languages".
> See chapter 14 of the Unicode standard for the full
> description. The list there is:
> Old Italic
> Linear B
> Old Persian
> All of these are expressed in blocks, and therefore could
> easily be added to the IgnorableBlocks (D) category, which
> already contains Combining Diacritical Marks for Symbols,
> Musical Symbols, Ancient Greek Musical Notation, and the
> Private Use Area.
> I propose that we add all archaic scripts, as defined by the
> Unicode Consortium, to this category.
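To make the block-based proposal above concrete, here is a minimal sketch in Python of what excluding whole blocks would amount to for a single label. It is only an illustration under stated assumptions: the names ARCHAIC_BLOCKS, archaic_blocks_in, and is_allowed are hypothetical and appear in no IDNA document, and the table covers only the three scripts named in the list above, using their Unicode block ranges (Linear B occupies two blocks).

```python
# Hypothetical illustration only: none of these names come from the IDNA drafts.
# Code point ranges are the Unicode block ranges for the three scripts listed above.
ARCHAIC_BLOCKS = {
    "Linear B Syllabary": (0x10000, 0x1007F),
    "Linear B Ideograms": (0x10080, 0x100FF),
    "Old Italic": (0x10300, 0x1032F),
    "Old Persian": (0x103A0, 0x103DF),
}


def archaic_blocks_in(label):
    """Return the sorted names of the listed blocks that `label` draws on."""
    hits = set()
    for ch in label:
        cp = ord(ch)
        for name, (low, high) in ARCHAIC_BLOCKS.items():
            if low <= cp <= high:
                hits.add(name)
    return sorted(hits)


def is_allowed(label):
    """A label passes this check only if it uses none of the listed blocks."""
    return not archaic_blocks_in(label)
```

The same table could back either a protocol-level exclusion or a per-registry restriction; the check is identical, and only where it is enforced differs.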
While my conclusion is more or less the same as Mark's, my
reasoning is a bit different. FWIW...
Disallowing these scripts is very serious business, especially
given the question of how safe and easy, or not, it is to move
things from Disallowed to Protocol-Valid. If we had MAYBE, that
might be fine, but the costs of having/re-introducing MAYBE are
high enough that I hope you aren't suggesting it just for this
handful of scripts (although it certainly is not impossible).
However, what I think is more important is that, in a world in
which one of the oft-cited justifications for IDNs is linguistic
and cultural preservation and restoration, classification of a
script as archaic by the Unicode Consortium may or may not be
appropriate from the standpoint of UNESCO or various
anthropological and archeological communities. In particular,
and keeping in mind that, as I don't need to remind you, we need
to design for IDNs at all levels of the DNS tree, if one were a
research institute dedicated to one of the cultures that use one
of these languages, it would be perfectly reasonable to assign
host names in them even with no primary-language users of the
script for centuries.
It also seems to me that, in the general case, the letters,
combining marks, and digits of "archaic" scripts are no more
likely to be harmful than the letters, combining marks, and
digits of ones that see more contemporary usage. Excluding
them would be a perfectly reasonable candidate for a registry
restriction. I would imagine that no registry with a very large
registration scope and a good sense of balance and
responsibility would want to permit them. But such registry
restrictions are a very different situation from disallowing the