Thai unicode code points

Patrik Fältström paf at cisco.com
Sun Dec 10 14:02:53 CET 2006


On 10 dec 2006, at 02.25, Domain Guru wrote:

> I have just read:
>
> http://www.ietf.org/internet-drafts/draft-faltstrom-idnabis-tables-01.txt
>
> I am particularly interested in the Thai language, and concerned  
> about entries like:
>
> | Possibly not | U+0E33 | U+0E4D | Lo Mn | THAI CHARACTER NIKHAHIT |
> | Possibly not | U+0E34 | U+0E34 | Mn    | THAI CHARACTER SARA I   |
> | Possibly not | U+0E35 | U+0E35 | Mn    | THAI CHARACTER SARA II  |
> | Possibly not | U+0E36 | U+0E36 | Mn    | THAI CHARACTER SARA UE  |
> | Possibly not | U+0E37 | U+0E37 | Mn    | THAI CHARACTER SARA UEE |
> | Possibly not | U+0E38 | U+0E38 | Mn    | THAI CHARACTER SARA U   |
> | Possibly not | U+0E39 | U+0E39 | Mn    | THAI CHARACTER SARA UU  |
> | Possibly not | U+0E3A | U+0E3A | Mn    | THAI CHARACTER PHINTHU  |
> | Possibly not | U+0E48 | U+0E48 | Mn    | THAI CHARACTER MAI EK   |
> | Possibly not | U+0E49 | U+0E49 | Mn    | THAI CHARACTER MAI THO  |
>
> If these were excluded for IDNs, it would destroy the point of Thai
> IDNs, i.e. half of all Thai words wouldn't be "legal" any more.
>
> Can you tell me how I can comment on this draft, and who is  
> responsible for the Thai language entries? When is a final draft  
> likely to happen?

First of all, the discussion is held on the idna-update at alvestrand.no  
mailing list.

Secondly, there is a new draft of the draft (!) that you can see at  
the following URI:

http://stupid.domain.name/table.latest.html

There are suggestions on how to make changes, but so far the
conclusion seems to be that the tools the Unicode Consortium has made
available (the triple {script, block, class}) plus bidi properties
are not enough to classify code points into what is allowed and what
is not.

At least this is my personal thinking.
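
As an illustration only (not part of the draft or of the tools used to
build its tables), the following Python sketch prints the general
category ("class") and bidi class of the Thai code points quoted above.
It assumes the standard library unicodedata module, which exposes
neither script nor block, so only part of the triple is shown. The
helper name describe is hypothetical.

```python
import unicodedata

# Code points taken from the quoted table rows earlier in this message.
THAI_CODE_POINTS = [
    0x0E33, 0x0E34, 0x0E35, 0x0E36, 0x0E37,
    0x0E38, 0x0E39, 0x0E3A, 0x0E48, 0x0E49,
]


def describe(cp):
    """Return (code point, general category, bidi class, name) for cp."""
    ch = chr(cp)
    return (
        f"U+{cp:04X}",
        unicodedata.category(ch),       # e.g. "Mn" for a nonspacing mark
        unicodedata.bidirectional(ch),  # e.g. "NSM" or "L"
        unicodedata.name(ch),
    )


for row in map(describe, THAI_CODE_POINTS):
    print(" | ".join(row))
```

Most of these code points come back as nonspacing marks (Mn), which is
why any rule keyed on that class affects so many Thai words.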

Every time I (or someone else) present a new permutation of the
triple, someone else finds one or more cases that do not make sense.

In the case of Thai, please look at the latest table and come up with
suggestions for values of the triple that should be allowed and not
allowed, and I will adjust the latest tables.
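
One way to try out such a suggestion is sketched below in Python. The
rule here is purely hypothetical: it looks only at the general category
(one element of the triple) and is not the rule used in the draft. The
names rejected, LETTERS_ONLY and WITH_MARKS are made up for this
example, and the sample word is an arbitrary Thai word chosen because
it contains a tone mark.

```python
import unicodedata


def rejected(label, allowed_categories):
    """Return the code points in label whose general category is not allowed."""
    return [
        f"U+{ord(ch):04X}"
        for ch in label
        if unicodedata.category(ch) not in allowed_categories
    ]


# Thai word for "water": NO NU (U+0E19), MAI THO (U+0E49), SARA AM (U+0E33).
WORD = "\u0E19\u0E49\u0E33"

# Hypothetical candidate rules, expressed as sets of general categories.
LETTERS_ONLY = {"Ll", "Lo", "Lm", "Nd"}
WITH_MARKS = LETTERS_ONLY | {"Mn"}

print(rejected(WORD, LETTERS_ONLY))  # lists the tone mark U+0E49
print(rejected(WORD, WITH_MARKS))    # empty list: every code point passes
```

Running a list of real Thai words through a candidate rule like this
gives a quick check on whether the rule excludes ordinary words.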

     Patrik

