kenw at sybase.com
Tue Mar 24 19:32:36 CET 2009
> Having said that, I am quite concerned about adding yet another
> non-ASCII dot in IDNAv2 (U+2CFE COPTIC FULL STOP) because ...
Whatever the merits of the rest of the argument about
unassigned versus assigned, and so on, this piece of
text in draft-hoffman-idna2-02 is simply an error.
The existing Section 3.1 of RFC 3490 specifies *exactly* the
CJK full stops which are used as periods (full stops, i.e.
"dots") in CJK IMEs. It is not, and never has been, intended
to be extended to every terminal punctuation mark in
Unicode that happens to have "FULL STOP" in its name,
no matter whether the shape of that punctuation mark is
dotlike or not. Nobody expects (or at least nobody should
expect) that all terminal punctuation of this sort from
every script inherits a label separation function for
In particular, the claim in draft-hoffman-idna2-02 that
U+2CFE COPTIC FULL STOP should be added to the list of
"dots" in Section 3.1 in RFC 3490 is just bizarre and
should be removed.
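For concreteness, here is a minimal Python sketch of what the
Section 3.1 requirement amounts to, assuming the four characters
listed there (U+002E, U+3002, U+FF0E, U+FF61). The names
RFC3490_DOTS and split_labels are hypothetical, chosen only for
illustration; this is not taken from any actual IDNA implementation.

```python
# The four characters RFC 3490 Section 3.1 requires to be recognized
# as dots when dots are used as label separators.
RFC3490_DOTS = (
    "\u002E"  # FULL STOP
    "\u3002"  # IDEOGRAPHIC FULL STOP
    "\uFF0E"  # FULLWIDTH FULL STOP
    "\uFF61"  # HALFWIDTH IDEOGRAPHIC FULL STOP
)


def split_labels(name):
    """Split a domain name into labels on the RFC 3490 dots only.

    Hypothetical helper: any other character, including ones whose
    Unicode name contains "FULL STOP", is left inside its label.
    """
    for dot in RFC3490_DOTS[1:]:
        # Map each CJK dot to the ASCII full stop before splitting.
        name = name.replace(dot, ".")
    return name.split(".")
```

Under this sketch, U+2CFE COPTIC FULL STOP is simply not a member of
the set, so a string containing it stays a single label.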
U+2CFE COPTIC FULL STOP is not a punctuation mark with a
dotlike shape. It has no better claim to that than various
other script-specific "FULL STOP" punctuation marks added
post-Unicode-3.2, including (in the same block)
U+2CFB COPTIC OLD NUBIAN FULL STOP and (in Unicode 5.1)
U+A60E VAI FULL STOP. At least the Vai full stop is sorta
dotlike, unlike the Coptic ones.
Furthermore, *this* is what Coptic looks like:
It is a category error to try to extend the interoperability
requirements for CJK (with massive current usage and with
well-known IME issues for dots) with a rather rare punctuation
mark for a historic script. If you actually look through
Coptic texts, the only reasonably common punctuation mark
amongst them is a single midline dot, which is not U+2CFE.
And no, I'm not suggesting that *that* dot be added to
Section 3.1 of RFC 3490 instead. ;-)