FW: Your statement on Identifiers and Unicode 7.0.0
John C Klensin
klensin at jck.com
Wed Feb 4 21:33:06 CET 2015
--On Wednesday, February 04, 2015 17:44 +0100 Jefsey
<jefsey at jefsey.com> wrote:
> At 07:31 03/02/2015, Abdulrahman I. ALGhadir wrote:
>> "A general rule may be extracted that combining marks should
>> not be allowed for TLDs."
> We are in agreement. This is the problem of having chosen
> Unicode instead of having deployed a non-confusable Unigraph
> compatible table.
> The "consensus" knew better. A lot of wasted time and money,
> and of unnecessary irritation.
I suggest that both of you read the subthread that contains
three very long notes between Asmus and myself. Among the
things you will learn there is that a "no combining mark"
system will not work for many uses with Latin script in Unicode
or would require a _huge_ code set for many other scripts.
Similarly, while "no combining characters" will work well for
writing the Arabic language in Arabic script, it will work much
less well for several other languages written in that script
unless a lot of other precomposed characters are added. If one
considers what Unicode and IDNA call "joiners" to be combining
characters -- they certainly are in the sense that they modify
the effects and sometimes the shape of the characters associated
with the code points that precede or follow them -- then even a
wider selection of precomposed characters is insufficient.
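As a rough illustration of the Latin-script point (a sketch only, not
something taken from the subthread): the hypothetical helper below uses
Python's standard unicodedata module to check whether a combining mark
is still present after NFC normalization, i.e. whether Unicode has a
precomposed character to absorb it. The helper name and the two sample
strings are my own assumptions, chosen for illustration.

```python
import unicodedata


def has_combining_mark_after_nfc(text: str) -> bool:
    """Return True if a combining mark survives NFC normalization.

    Hypothetical helper for illustration: a True result means no
    precomposed code point exists for at least one base+mark sequence,
    so a "no combining marks" rule would reject the string outright.
    """
    composed = unicodedata.normalize("NFC", text)
    # General categories Mn, Mc, and Me are the Unicode "mark" categories.
    return any(unicodedata.category(ch).startswith("M") for ch in composed)


# "e" + COMBINING ACUTE ACCENT composes to the single code point U+00E9.
e_acute = "e\u0301"
# "q" + COMBINING TILDE has no precomposed form, so the mark remains.
q_tilde = "q\u0303"

print(has_combining_mark_after_nfc(e_acute))
print(has_combining_mark_after_nfc(q_tilde))
```

Note that this check looks only at general-category marks. The joiners
mentioned above are format characters rather than marks, so a test of
this kind would not catch them.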
_Please_ do not assume that you can generalize from the
characters, languages, and scripts with which you are most
familiar to everything else and to extremely broad rules. It
just doesn't work unless you are willing to give up on very
different writing systems.
As to "a non-confusable Unigraph compatible table", I look
forward to seeing a serious and detailed proposal. Many of us
believe the notion is impossible for reasons that have at least
as much to do with human perception as with writing systems.