Deprecated characters?

John C Klensin klensin at jck.com
Thu Jul 17 23:03:42 CEST 2008


Ken,

Let me try to restate at least half of my question in the light
of your (extremely helpful and patient, IMO) answer...

--On Thursday, 17 July, 2008 12:54 -0700 Kenneth Whistler
<kenw at sybase.com> wrote:

> Patrik asked:
> 
>> Question: Some of the codepoints that either are, or are
>> suggested to be, deprecated are PVALID according to the
>> tables document.
>> 
>> Does that create any problems?
> 
> Short answer: No.
 
> Long answer:
>...
> U+17A3 KHMER INDEPENDENT VOWEL QAQ
> U+17A4 KHMER INDEPENDENT VOWEL QAA
> U+17D3 KHMER SIGN BATHAMASAT
>...

> Of those, two of the Khmer characters (17A3, 17D3) are
> *already* Deprecated=True in the standard, and 17A4 is
> already annotated as discouraged from use in both the
> code charts and in the text of the standard. I consider
> it quite likely to also be designated as Deprecated=True
> as a result of this PRI discussion, for consistency with
> the already-deprecated U+17A3.
>...
> Given that that is the likely outcome, the entire issue
> boils down to 3 Khmer characters.
> 
> Of those, two were encoded to be used for
> Pali/Sanskrit *transliteration* into Khmer, so were of
> historic, marginal use anyway, even as intended. But the
> Cambodian feedback has been that even in that marginal use,
> other characters are preferred. U+17D3 was intended as a
> combining mark used in the representation of some *very* rare
> historic lunar date symbols, and even that usage has been
> supplanted by simply encoding a complete set of the pre-formed
> symbols (none of which could be used in IDN's, anyway) --
> hence the deprecation of U+17D3.
> 
> And I guarantee you that any Khmer ccTLD registrar would never
> allow any of these three characters in a valid Khmer domain
> name registration.
>...
  
> I trust there really is no stomach for dropping all the
> progress on a table definition which can automatically be
> updated for future Unicode versions based on the generic
> properties already identified in idnabis-tables.txt.
> 
> If so, then the current status of U+17A3 and U+17D3 as
> Deprecated=True, the likely outcome that U+17A4 will also
> end up Deprecated=True, and the outside chance that
> U+0953 and U+0954 will, too, has no real impact on anything
> about how the current documents for the protocol are worded,
> no potential impact on the future maintenance of the table,
> and no practical impact on registrar policies.

With the understanding that, if anyone has stomach for that
activity, I'm not among them, it would appear that we could make
a one-time decision to disallow any character with
Deprecated=True as of some (near) date, with the understanding
that characters deprecated after that date would not be affected
in any way, then or in the future.  The question is whether that
is worth the (fairly small) effort.  I infer from your note that
the answer is probably "no", but feel obligated to ask.
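
As a rough illustration of what such a one-time cut-off would mean
in practice, here is a minimal Python sketch.  The names
(FROZEN_DEPRECATED, disallowed_by_snapshot, is_label_allowed) are
hypothetical, and the snapshot holds only the two code points
described above as already Deprecated=True; a real snapshot would
be taken from the Unicode property data as of the chosen date.

```python
# Illustrative snapshot only (assumption): the two code points this
# thread describes as already Deprecated=True. It is frozen at the
# cut-off date and never regenerated for later Unicode versions.
FROZEN_DEPRECATED = frozenset({0x17A3, 0x17D3})


def disallowed_by_snapshot(label: str) -> list[str]:
    """Return the code points in `label` that are in the frozen snapshot."""
    return [f"U+{ord(ch):04X}" for ch in label if ord(ch) in FROZEN_DEPRECATED]


def is_label_allowed(label: str) -> bool:
    """A label passes this check if it has no snapshot-deprecated character."""
    return not disallowed_by_snapshot(label)
```

Under this rule U+17A4 would stay permitted even if it is later
marked Deprecated=True, because the set is never updated after the
cut-off.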

    john


