Protocol-08 (and status of Defs-04 and Rationale-06)

Alireza Saleh saleh at nic.ir
Wed Dec 10 20:26:42 CET 2008


I would sincerely like to see someone out there answer the following
question:


Why has the co-occurrence of AN and EN been forbidden by -bidi? I
read that part of the document but didn't see any reason given other than
visual confusion or possible rearrangement of the label. If
all visual confusion and character sequencing problems were solved by
setting this rule, then it would make sense. However, note the following
cases:

1. <ALEF>.3.com (as I stated before)
2. <U+064A><U+0627>.com ( http://www.nic.ir/Show_Text?c=%D9%8A%D8%A7&s=14&b=ffffff7f&f=01292200&t=DejaVuSans )
   versus
   <U+06CC><U+0627>.com ( http://www.nic.ir/Show_Text?c=%DB%8C%D8%A7&s=14&b=ffffff7f&f=01292200&t=DejaVuSans )
   (visual confusion problem).
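
To make case 2 concrete, here is a minimal sketch using Python's standard unicodedata module. The helper name describe is only for illustration. It shows that the two labels are different code point sequences that stay different after NFC normalization, and that neither contains any AN or EN character, so a rule about AN/EN co-occurrence has nothing to say about them.

```python
import unicodedata

def describe(label):
    """Return (code point, name, bidi class) for each character (illustrative helper)."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))
        for ch in label
    ]

label_a = "\u064A\u0627"  # ARABIC LETTER YEH + ALEF
label_b = "\u06CC\u0627"  # ARABIC LETTER FARSI YEH + ALEF

for label in (label_a, label_b):
    for entry in describe(label):
        print(entry)
    print()

# The labels are distinct strings, and NFC normalization does not merge them.
same_after_nfc = (
    unicodedata.normalize("NFC", label_a) == unicodedata.normalize("NFC", label_b)
)
print("identical after NFC:", same_after_nfc)
```

Both labels consist only of bidi class AL characters, so any protection against confusing them has to come from somewhere other than the digit rule, for example from registry policy.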



Will the rules solve these cases, whether the -bidi rules or the context
rules? Or will the registry still need to add further restrictions?
Obviously it will. For these reasons, we believe that numerals should not be
treated any differently by -bidi. I think it is better to let the
registry decide how to deal with these kinds of problems. dotIR is
considering the possibility of allowing domains like
<U+062C><U+06F5><U+0665>.ir. Why should such a domain be banned by the
protocol?
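
As an illustration of why that label runs into the rule, the sketch below looks up the Unicode bidi class of each character with Python's standard unicodedata module. The function names are hypothetical, and the check covers only the AN/EN co-occurrence condition discussed here, not the full set of -bidi label rules.

```python
import unicodedata

def bidi_classes(label):
    """Return the Unicode bidi class of every character in the label."""
    return [unicodedata.bidirectional(ch) for ch in label]

def mixes_an_and_en(label):
    """True if the label contains both an Arabic Number (AN) and a European Number (EN).

    Simplified: this models only the co-occurrence condition, nothing else in -bidi.
    """
    classes = set(bidi_classes(label))
    return "AN" in classes and "EN" in classes

# JEEM + EXTENDED ARABIC-INDIC DIGIT FIVE + ARABIC-INDIC DIGIT FIVE
label = "\u062C\u06F5\u0665"

print(bidi_classes(label))     # ['AL', 'EN', 'AN']
print(mixes_an_and_en(label))  # True
```

The Extended Arabic-Indic digits (06F0..06F9) carry bidi class EN, the same class as the European digits, while the Arabic-Indic digits (0660..0669) carry AN. That is why a label mixing the two Arabic digit sets is caught by the AN/EN condition.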


Alireza


Kenneth Whistler wrote:


> ..... 
> Alright, that is what has been proposed so far. *But* we now need
> to take into account Harald's reminder that some combinations
> are already disallowed separately by the bidi rules on label
> well-formedness, quite independently of any consideration of
> CONTEXTO categorization. What the bidi rules require of label
> formation is:
>
> Bidi:     Forbid (d) and (f) [and (g) by corollary]. Allow (e).
>
> This changes the options entirely, in my assessment. If you
> check carefully now, this takes Mark's #2 off the table.
> It also takes Mark's #4 (and Eric's variant #4a) off the table,
> as well.
>
> What we really need to decide between is:
>
> Mark #1:  Forbid (d), (e), (f), and (g).
>
> Mark #3:  No prohibitions by protocol. Handle by registry filter.
>
> And for Mark #1, since the bidi rules *already* forbid (d), (f),
> and (g), operationally what this boils down to is deciding
> whether to:
>
>     Option alpha: Add Extended Arabic-Indic digits (06F0..06F9)
>                   to CONTEXTO in tables.txt and add a context
>                   rule in Appendix A prohibiting those from
>                   cooccurring in a label with European digits
>                   (0030..0039).
>                   
>     Option beta:  Not do option alpha.
>     
> Doing anything else, in my opinion, would over-engineer and
> needlessly complicate the specification, with no net improvement
> in the end result.
>
> The advantage I see in choosing option alpha is that it
> would add a symmetry to the handling of Arabic digits, making
> the mixing of either set of them with European digits
> prohibited in labels, irrespective of bidi arcana. That is
> easier for implementers and end users to understand than
> the somewhat odd conclusion that comes simply from application
> of the bidi rules. I think option alpha is also closer to
> what the (Arab) Arabic script input has been on the topic.
>
> The advantage I see in choosing option beta is that it
> keeps the tables document a little simpler, with one less
> abstruse context rule to check for. I think option beta
> is also closer to what the (Iranian) Arabic script input has been
> on the topic, unless I have misunderstood what Alireza has
> been saying.
>
> --Ken