Protocol-08 (and status of Defs-04 and Rationale-06)

Eric Brunner-Williams ebw at abenaki.wabanaki.net
Tue Dec 9 01:47:39 CET 2008



Kenneth Whistler wrote:
> O.k., time for some renumbering and analysis here.
>
> The logical possibilities for "forbidding at protocol level"
> one or another set of the digits in question, either
> singly or in contextual combinations are:
>
> ==============================================================
>
> a. European digits (0030..0039) alone
>
> b. Arabic-Indic digits (0660..0669) alone
>
> c. Extended Arabic-Indic digits (06F0..06F9) alone
>
> d. European + Arabic-Indic digits together
>
> e. European + Extended Arabic-Indic digits together
>
> f. Arabic-Indic + Extended Arabic-Indic digits together
>
> g. European + Arabic-Indic + Extended Arabic-Indic digits together
>
> ================================================================
>
> I think we can safely assume that (a) is off the table, for
> legacy ASCII labels.
>
> I think we can safely assume that (b) and (c) are off the table,
> as well, since nobody, as far as I can recall here, has been
> calling for an across-the-board prohibition (i.e. DISALLOWED
> categorization) of Arab Arabic digits on their own, or
> Perso-Arabic Arabic digits on their own.
>
> That leaves us with (d)..(g), which are the *contextual*
> prohibitions, to prevent mixing of certain combinations of
> these digits together in a single label.
>
> Now, returning to Mark's and Eric's numbering, the options
> they had, restated are:
>
> Mark #1:  Forbid (d), (e), (f), and (g). [Actually forbidding
>           (g) would be corollary, if (d), (e), and (f) were
>           forbidden.]
>          
> Mark #2:  Forbid (d) and (e) [and (g) by corollary]. Allow (f).
>           (= Eric #5)
>
> Mark #3:  No prohibitions by protocol. Handle by registry filter.
>
> Mark #4:  Forbid (f) [and (g) by corollary]. Allow (d) and (e).
>
> Eric #4a: Forbid (f) [and (g) by corollary] except for the
>           Arabic four, five, six (because not confusable).
>           Allow (d) and (e). [Actually "five" is confusable,
>           but this will be moot -- see below.]
>           
> Alright, that is what has been proposed so far. *But* we now need
> to take into account Harald's reminder that some combinations
> are already disallowed separately by the bidi rules on label
> well-formedness, quite independently of any consideration of
> CONTEXTO categorization. What the bidi rules require of label
> formation is:
>
> Bidi:     Forbid (d) and (f) [and (g) by corollary]. Allow (e).
>   

Could you point out the lines in the bidi document you are referring to here?

> This changes the options entirely, in my assessment. If you
> check carefully now, this takes Mark's #2 off the table.
> It also takes Mark's #4 (and Eric's variant #4a) off the table,
> as well.
>
> What we really need to decide between is:
>
> Mark #1:  Forbid (d), (e), (f), and (g).
>
> Mark #3:  No prohibitions by protocol. Handle by registry filter.
>
> And for Mark #1, since the bidi rules *already* forbid (d), (f),
> and (g), operationally what this boils down to is deciding
> whether to:
>
>     Option alpha: Add Extended Arabic-Indic digits (06F0..06F9)
>                   to CONTEXTO in tables.txt and add a context
>                   rule in Appendix A prohibiting those from
>                   co-occurring in a label with European digits
>                   (0030..0039).
>                   
>     Option beta:  Not do option alpha.
>     
> Doing anything else, in my opinion, would over-engineer and
> needlessly complicate the specification, with no net improvement
> in the end result.
>
> The advantage I see in choosing option alpha is that it
> would add a symmetry to the handling of Arabic digits, making
> the mixing of either set of them with European digits
> prohibited in labels, irrespective of bidi arcana. That is
> easier for implementers and end users to understand than
> the somewhat odd conclusion that comes simply from application
> of the bidi rules. I think option alpha is also closer to
> what the (Arab) Arabic script input has been on the topic.
>
> The advantage I see in choosing option beta is that it
> keeps the tables document a little simpler, with one less
> abstruse context rule to check for. I think option beta
> is also closer to what the (Iranian) Arabic script input has been
> on the topic, unless I have misunderstood what Alireza has
> been saying.
>
> --Ken

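For reference, the combinations (d)..(g) and the two remaining options can be sketched as a small check. This is only an illustration of the summary quoted above, taking the stated bidi outcome (forbid (d) and (f), allow (e)) at face value; the function names and the policy labels "beta" and "alpha" are hypothetical and do not come from any of the drafts.

```python
# Digit ranges exactly as listed in the quoted message.
EUROPEAN = range(0x0030, 0x003A)          # 0030..0039
ARABIC_INDIC = range(0x0660, 0x066A)      # 0660..0669
EXT_ARABIC_INDIC = range(0x06F0, 0x06FA)  # 06F0..06F9


def digit_sets(label):
    """Return the names of the digit sets that occur in a label."""
    found = set()
    for ch in label:
        cp = ord(ch)
        if cp in EUROPEAN:
            found.add("european")
        elif cp in ARABIC_INDIC:
            found.add("arabic-indic")
        elif cp in EXT_ARABIC_INDIC:
            found.add("extended")
    return found


# Pairwise combinations (d), (e), (f); (g) follows by corollary,
# since a label with all three sets contains every pair.
D = frozenset({"european", "arabic-indic"})
E = frozenset({"european", "extended"})
F = frozenset({"arabic-indic", "extended"})

# Hypothetical policy table: which pairs each option forbids.
POLICIES = {
    "beta": [D, F],      # only what the bidi rules are said to forbid
    "alpha": [D, E, F],  # additionally forbid European + Extended
}


def allowed(label, policy):
    """True if the label mixes no forbidden pair under the policy."""
    present = digit_sets(label)
    return not any(pair <= present for pair in POLICIES[policy])
```

Under these assumptions the two options differ only on labels that mix European and Extended Arabic-Indic digits, for example `allowed("a1\u06f1", "beta")` versus `allowed("a1\u06f1", "alpha")`.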
