UTC Agenda Item: IDNA proposal

Wed Nov 22 14:28:12 CET 2006

A version that accepts classes Ll, Lo and Mn can be found at

http://stupid.domain.name/idnabis/table-lllomn.html

What about class Nd?

    Patrik

On 22 nov 2006, at 14.07, Harald Alvestrand wrote:

> Class Mn contains the HEBREW POINT QAMATS that the -bidi draft is  
> busy defending. Can't eliminate that.
>
>            Harald
>
> --On 22. november 2006 13:52 +0100 Patrik Fältström  
> <patrik at frobbit.se> wrote:
>
>> I have recreated the tables using a new algorithm (based mostly on
>> input from Kenneth).
>>
>> (1) Use the scripts.txt file for the script definitions, do not use
>> the blocks definitions
>>
>> (2) Remove codepoints where cp != NFKC(cp)
>>
>> (3) Remove codepoints where cp != lowercase(cp)
>>
>> (4) Remove codepoints where class(cp) != "Ll"
>>
>> (5) Include codepoints that are part of US-ASCII (0-9, A-Z and a-z)
>>
>> The result of doing this for U+0000 - U+FFFF can be found as
>>
>> http://stupid.domain.name/idnabis/table-ll.html
>>
>> If I instead accept both class Ll and class Lo in step 4, then the
>> result can be found as
>>
>> http://stupid.domain.name/idnabis/table-lllo.html
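The five steps above can be sketched roughly as follows. This is only an illustration, not the script that produced the tables: the function names (`parse_scripts`, `is_allowed`, `build_table`) are hypothetical, it assumes that step (5) overrides steps (2)-(4) for ASCII letters and digits, and it assumes the script data is only used to group the surviving code points. The output depends on the Unicode version shipped with the Python build, so it will not match tables generated in 2006 exactly.

```python
import unicodedata

# Step (5): US-ASCII letters and digits are always included.
ASCII_ALNUM = frozenset(
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
)


def parse_scripts(lines):
    """Step (1): yield (first, last, script) from lines in Scripts.txt format."""
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not data:
            continue
        cp_range, script = (field.strip() for field in data.split(";"))
        first, _, last = cp_range.partition("..")
        yield int(first, 16), int(last or first, 16), script


def is_allowed(cp, classes=frozenset({"Ll"})):
    """Apply steps (2)-(5) to a single code point."""
    ch = chr(cp)
    if ch in ASCII_ALNUM:                        # step (5)
        return True
    if unicodedata.category(ch) not in classes:  # step (4)
        return False
    if unicodedata.normalize("NFKC", ch) != ch:  # step (2)
        return False
    return ch.lower() == ch                      # step (3)


def build_table(script_ranges, classes=frozenset({"Ll"}), limit=0xFFFF):
    """Group the allowed code points up to `limit` by script name."""
    table = {}
    for first, last, script in script_ranges:
        for cp in range(first, min(last, limit) + 1):
            if is_allowed(cp, classes):
                table.setdefault(script, []).append(cp)
    return table
```

Passing `classes={"Ll", "Lo"}` corresponds to the second table, and adding `"Mn"` or `"Nd"` to the set gives the other variants discussed in this thread.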
>>
>> Please let me know what you think.
>>
>> I have this comment regarding one entry from class Lm:
>>
>>>>  | Exclude  | U+02BB | U+02BB | Lm    | MODIFIER LETTER TURNED COMMA |
>>>>  | Exclude  | U+02BC | U+02BC | Lm    | MODIFIER LETTER APOSTROPHE   |
>>>>
>>>
>>> As ASCII isn't directly encodable using Punycode, one of these is
>>> going to need to be allowed for Pacific languages, which use the
>>> apostrophe, e.g. Hawaiʻi. It is often ignored, but in languages like
>>> Tongan it can make a difference.
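To make the comment above concrete, here is a small sketch of how a label containing U+02BB would pass through Punycode, using the `punycode` codec from the Python standard library. `to_ace` is a hypothetical helper written for this illustration. It only adds the `xn--` prefix and skips nameprep and all validation, so it is not a complete ToASCII implementation.

```python
import unicodedata


def to_ace(label):
    """Punycode-encode one label and add the ACE prefix (no nameprep, no checks)."""
    return "xn--" + label.encode("punycode").decode("ascii")


# Both candidate characters are in general category Lm, which is why the
# Ll-only and Ll+Lo tables exclude them.
for ch in ("\u02bb", "\u02bc"):
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {unicodedata.name(ch)}")

label = "hawai\u02bbi"  # "hawaiʻi", already lowercased
ace = to_ace(label)
print(ace)

# Decoding the part after the prefix gives the original label back.
print(ace[4:].encode("ascii").decode("punycode") == label)
```

The ASCII letters of the label are copied through unchanged, and only the position and value of the modifier letter are encoded in the suffix after the last hyphen.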
>>
>> I have not taken this into account when creating these tables.
>>
>>      Regards, Patrik