Deprecated characters?

Kenneth Whistler kenw at
Thu Jul 17 23:36:30 CEST 2008

John followed up:

> > If so, then the current status of U+17A3 and U+17D3 as
> > Deprecated=True, the likely outcome that U+17A4 will also
> > end up Deprecated=True, and the outside chance that
> > U+0953 and U+0954 will, too, has no real impact on how
> > the current documents for the protocol are worded,
> > no potential impact on the future maintenance of the table,
> > and no practical impact on registrar policies.
> With the understanding that, if anyone has stomach for that
> activity, I'm not among them, it would appear that we could make
> a one-time decision to disallow any character with
> Deprecated=True as of some (near) date, with the understanding
> that characters deprecated after that date would not be affected
> in any way, then or in the future.  The question is whether that
> is worth the (fairly small) effort.  I infer from your note that
> the answer is probably "no", but feel obligated to ask.

Correct, I don't think it is worth adding 3 more characters
to an exception list just for this, given that the tables
now (on the basis of what I considered rather thin arguments)
allow in thousands of useless (for IDNs) historic characters
from dead scripts. A few deprecated Khmer characters amidst that
huge pile make no effective difference.

Further, you'd just be buying another edge case for people
to pick at in the future, if the UTC decided, for whatever
reason, to add a few more characters to the Deprecated=True
list, after the IDNA tables document is released.
Adding poorly understood inconsistencies like that would not
be a net positive for the protocol definition.
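
To make that edge case concrete, here is a minimal sketch, assuming a
hypothetical frozen exception list taken at the cutoff date. The names
FROZEN_DEPRECATED, disallowed_by_snapshot and disallowed_by_live_property
are illustrative only and come from no IDNA document; the snapshot holds
just the two code points this thread names as currently Deprecated=True,
not the full Unicode list, and the later deprecation of U+17A4 is the
"likely outcome" mentioned above, not a fact asserted here.

```python
# Hypothetical one-time snapshot of Deprecated=True characters at the cutoff.
# Only the two code points named in this thread; not the full Unicode list.
FROZEN_DEPRECATED = frozenset({0x17A3, 0x17D3})


def disallowed_by_snapshot(cp: int) -> bool:
    """One-time decision: only characters in the frozen snapshot are disallowed."""
    return cp in FROZEN_DEPRECATED


def disallowed_by_live_property(cp: int, current_deprecated: frozenset) -> bool:
    """What a reader would expect if the rule tracked the live property."""
    return cp in current_deprecated


# Assumption for illustration: the UTC later marks U+17A4 as Deprecated=True.
later_deprecated = FROZEN_DEPRECATED | {0x17A4}

# Code points where the frozen rule and the live property disagree.
inconsistent = sorted(
    cp
    for cp in later_deprecated
    if disallowed_by_snapshot(cp) != disallowed_by_live_property(cp, later_deprecated)
)

for cp in inconsistent:
    print(f"U+{cp:04X}: Deprecated=True in Unicode, yet still allowed by the snapshot")
```

Every character deprecated after the cutoff lands in that "inconsistent"
set, which is the kind of poorly understood special case at issue.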

Also, the "Deprecated" property for characters in the Unicode
Standard is generally of little import to others anyway.
As Mark indicated, it doesn't mean characters are removed,
nor does it mean that you can't continue to use them. It
is different from what most standards organizations mean by "deprecated"
when they apply it to some standard or some sub-part of a
standard. By rights, it should have been named the
"Discouraged_For_Use" character property instead, because
that's how it got started in the first place. And the current
PRI exercise is simply an attempt to accumulate a few more
of the miscellaneous discouragement annotations and make them
and the property statement a bit more consistent.

Personally, I opposed the reification of this
discouragement regarding the use of some characters as
a character property in the first place, and the use of
the name "Deprecated" for it -- precisely because of the
potential for confusion in communication with other standards
developing organizations such as the IETF, W3C, and so on --
but I lost those battles in the UTC long ago.

