Thai unicode code points
Patrik Fältström
paf at cisco.com
Sun Dec 10 14:02:53 CET 2006
On 10 dec 2006, at 02.25, Domain Guru wrote:
> I have just read:
>
> http://www.ietf.org/internet-drafts/draft-faltstrom-idnabis-tables-01.txt
>
> I am particularly interested in the Thai language, and concerned
> about entries like:
>
> | Possibly not | U+0E33 | U+0E4D | Lo Mn | THAI CHARACTER NIKHAHIT |
> | Possibly not | U+0E34 | U+0E34 | Mn | THAI CHARACTER SARA I |
> | Possibly not | U+0E35 | U+0E35 | Mn | THAI CHARACTER SARA II |
> | Possibly not | U+0E36 | U+0E36 | Mn | THAI CHARACTER SARA UE |
> | Possibly not | U+0E37 | U+0E37 | Mn | THAI CHARACTER SARA UEE |
> | Possibly not | U+0E38 | U+0E38 | Mn | THAI CHARACTER SARA U |
> | Possibly not | U+0E39 | U+0E39 | Mn | THAI CHARACTER SARA UU |
> | Possibly not | U+0E3A | U+0E3A | Mn | THAI CHARACTER PHINTHU |
> | Possibly not | U+0E48 | U+0E48 | Mn | THAI CHARACTER MAI EK |
> | Possibly not | U+0E49 | U+0E49 | Mn | THAI CHARACTER MAI THO |
>
> If these were excluded from IDNs, it would defeat the point of Thai
> IDNs, i.e. half of all Thai words would no longer be "legal".
>
> Can you tell me how I can comment on this draft, and who is
> responsible for the Thai language entries? When is a final draft
> likely to happen?
First of all, the discussion is held on the idna-update at alvestrand.no
mailing list.

Secondly, there is a new draft of the draft (!) that you can see at
the following URI:

http://stupid.domain.name/table.latest.html

There are suggestions on how to make changes, but so far the
conclusion seems to be that the tools the Unicode Consortium has made
available (the triple {script, block, class}) plus bidi properties
are not enough to classify code points into what is allowed and what
is not. At least this is my personal thinking.

Every time I (or someone else) present a new permutation of the
triple, someone else finds one or more cases that do not make sense.

In the case of Thai, look at the latest table and come up with
suggestions for which values of the triple should be allowed and
which should not, and I will adjust the latest tables.

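As a rough illustration of what inspecting "class" and bidi properties
looks like for the code points quoted above, here is a minimal Python
sketch. It is not how the draft tables are generated. It relies on the
standard unicodedata module, which exposes the general category and
bidi class but not script or block, so the Thai block range
U+0E00..U+0E7F is hardcoded as an assumption. The helper names
describe and allowed_by_class are made up for this example, and the
values returned depend on the Unicode version bundled with Python.

```python
import unicodedata

# Code points taken from the table quoted earlier in this message.
CODE_POINTS = [
    0x0E33, 0x0E34, 0x0E35, 0x0E36, 0x0E37, 0x0E38,
    0x0E39, 0x0E3A, 0x0E48, 0x0E49, 0x0E4D,
]

# Assumption: the Thai block spans U+0E00..U+0E7F. The standard library
# does not expose block or script, so the range is hardcoded here.
THAI_BLOCK = range(0x0E00, 0x0E80)


def describe(cp):
    """Return the properties unicodedata can report for one code point."""
    ch = chr(cp)
    return {
        "code_point": f"U+{cp:04X}",
        "name": unicodedata.name(ch),
        "class": unicodedata.category(ch),     # general category, e.g. Lo, Mn
        "bidi": unicodedata.bidirectional(ch),  # bidi class
        "in_thai_block": cp in THAI_BLOCK,
    }


def allowed_by_class(cp, allowed_classes):
    """Hypothetical rule: allow a Thai-block code point by class alone."""
    return cp in THAI_BLOCK and unicodedata.category(chr(cp)) in allowed_classes


if __name__ == "__main__":
    for cp in CODE_POINTS:
        row = describe(cp)
        print(row["code_point"], row["class"], row["bidi"], row["name"])
```

With a rule such as allowed_by_class(cp, {"Lo"}), every Mn entry in the
list (the vowel signs and tone marks) is rejected, which is the concern
raised in the quoted message. Adding "Mn" to the set admits them, but a
rule that coarse may also admit marks that should stay excluded.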
Patrik