Zero-Width Joiner in draft-ietf-idnabis-tables

Gihan Dias gihan at uom.lk
Fri Aug 7 17:58:29 CEST 2009


On 2009-08-07 at 12:59 PM, Patrik wrote:
> On 2 aug 2009, at 19.08, Gihan Dias wrote:
>
>    
>> We are very keen to ensure that zero-width joiner (U+200D) is *allowed*
>> for domains in the Sinhala script. So please include Sinhala in A.2.
>> Also, is the Joining Type OK?
>>      
> The rule is at the moment the following:
>
> If Canonical_Combining_Class(Before(cp)) .eq. Virama Then True;
>
> This implies you can use zero-width joiner after SINHALA SIGN AL-LAKUNA
> (as it has a Canonical_Combining_Class of Virama, i.e. 9).
>
> I hope that solves the issues with the Sinhala script.
>    
Patrik,

Yes, the rules in 
http://stupid.domain.name/stuff/draft-ietf-idnabis-tables-06c.txt look 
OK (I have not checked the second rule in A.2).

How is one supposed to find this document? It is not referenced from the 
WG site, and I had to do some detective work to find it.

Mark,

Is this included in your tool at http://unicode.org/cldr/utility/idna.jsp ?

Sinhala and Tamil are not in the rules

$Ndeva $deva; [\u200C\u200D] ; fail
$Nbeng $beng; [\u200C\u200D] ; fail
$Nguru $guru; [\u200C\u200D] ; fail

in IDNA CONTEXT RULES (including BIDI) of 2009/04/03 21:12:29

Should they be there (or should the above three rules not be there)?

Thanks,

Gihan

----

Appendix A.2.  ZERO WIDTH NON-JOINER
    Code point:
       U+200C
    Overview:
       This may occur in a formally cursive script (such as Arabic) in a
       context where it breaks a cursive connection as required for
       orthographic rules, as in the Persian language, for example.  It
       also may occur in Indic scripts in a consonant conjunct context
       (immediately following a virama), to control required display of
       such conjuncts.
    Lookup:
       True
    Rule Set:
       False;
       If Canonical_Combining_Class(Before(cp)) .eq.  Virama Then True;
       If RegExpMatch((Joining_Type:{L,D})(Joining_Type:T)*\u200C
          (Joining_Type:T)*(Joining_Type:{R,D})) Then True;
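
As a rough illustration of how the A.2 rule set could be evaluated, here
is a minimal Python sketch. It is not taken from the draft or from any
existing tool. Python's standard unicodedata module does not expose
Joining_Type, so the sketch assumes the caller supplies a lookup; the
names zwnj_allowed, sample_joining_type and SAMPLE_JOINING_TYPES are
hypothetical, and the table covers only a handful of Arabic-script
characters for demonstration. The regular expression is approximated by
scanning outwards from the ZWNJ and skipping Joining_Type T characters.

```python
import unicodedata
from typing import Callable

ZWNJ = "\u200C"
VIRAMA_CCC = 9  # Canonical_Combining_Class value named "Virama"

# Illustrative subset only (assumption): a real implementation would
# load Joining_Type for all code points from the Unicode data files.
SAMPLE_JOINING_TYPES = {
    "\u0645": "D",  # ARABIC LETTER MEEM
    "\u06CC": "D",  # ARABIC LETTER FARSI YEH
    "\u062E": "D",  # ARABIC LETTER KHAH
    "\u0627": "R",  # ARABIC LETTER ALEF
    "\u064E": "T",  # ARABIC FATHA (transparent)
}


def sample_joining_type(ch: str) -> str:
    """Hypothetical lookup: anything not in the table is non-joining."""
    return SAMPLE_JOINING_TYPES.get(ch, "U")


def zwnj_allowed(
    label: str,
    index: int,
    joining_type: Callable[[str], str] = sample_joining_type,
) -> bool:
    """Apply the A.2 rule set to the ZWNJ at label[index]."""
    if label[index] != ZWNJ:
        raise ValueError("label[index] is not U+200C")

    # Rule 1: the preceding code point has combining class Virama.
    if index > 0 and unicodedata.combining(label[index - 1]) == VIRAMA_CCC:
        return True

    # Rule 2: (L|D) T* ZWNJ T* (R|D), checked by scanning both ways.
    i = index - 1
    while i >= 0 and joining_type(label[i]) == "T":
        i -= 1
    if i < 0 or joining_type(label[i]) not in ("L", "D"):
        return False

    j = index + 1
    while j < len(label) and joining_type(label[j]) == "T":
        j += 1
    return j < len(label) and joining_type(label[j]) in ("R", "D")
```

For example, zwnj_allowed("\u0645\u06CC\u200C\u062E", 2) passes under the
second rule (FARSI YEH before, KHAH after), while a ZWNJ directly after
ALEF fails because ALEF is right-joining only.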

Appendix A.3.  ZERO WIDTH JOINER
    Code point:
       U+200D
    Overview:
       This may occur in Indic scripts in a consonant conjunct context
       (immediately following a virama), to control required display of
       such conjuncts.
    Lookup:
       True
    Rule Set:
       False;
       If Canonical_Combining_Class(Before(cp)) .eq.  Virama Then True;
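
The A.3 rule needs only the combining class of the preceding code point,
so it can be checked with Python's standard unicodedata module. The
following is a minimal sketch rather than a reference implementation;
the function name zwj_allowed is hypothetical, and the result depends on
the Unicode version bundled with the Python interpreter.

```python
import unicodedata

ZWJ = "\u200D"
VIRAMA_CCC = 9  # Canonical_Combining_Class value named "Virama"


def zwj_allowed(label: str, index: int) -> bool:
    """Apply the A.3 rule set to the ZWJ at label[index]."""
    if label[index] != ZWJ:
        raise ValueError("label[index] is not U+200D")
    if index == 0:
        # Nothing precedes the ZWJ, so the default "False;" applies.
        return False
    # True only when the preceding code point has combining class Virama.
    return unicodedata.combining(label[index - 1]) == VIRAMA_CCC


# Sinhala KA + AL-LAKUNA + ZWJ + SSA: ZWJ follows U+0DCA (ccc 9).
sinhala_ok = zwj_allowed("\u0D9A\u0DCA\u200D\u0DC2", 2)

# Sinhala KA + ZWJ + SSA: ZWJ follows a plain consonant (ccc 0).
sinhala_bad = zwj_allowed("\u0D9A\u200D\u0DC2", 1)
```

Because the rule is written in terms of the combining class rather than
a list of scripts, it covers SINHALA SIGN AL-LAKUNA (U+0DCA) without
Sinhala having to be named explicitly.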





