[Idna-arabicscript] Reasons for disallowing Arabic script digit mixing at the protocol level

Eric Brunner-Williams ebw at abenaki.wabanaki.net
Wed Mar 11 19:27:16 CET 2009


At Minneapolis, and on the list immediately afterwards, several people 
demanded examples of "something useful" for mixed-script digits. So 
we're just repeating the statement from the mostly governmental 
contributors to Ram's Arabic project that things without meaning or 
elegance be banned in the protocol. John was kind enough to mule that 
statement back to the Minneapolis meeting from the not very widely 
announced meeting of "Arabic Script Experts" held after the ICANN 
meeting in Cairo four months ago, which no Iranians could attend and at 
which governments were much more prominent than not.

Now the same set of views has been offered as some sort of consensus 
document, and I think it is a consensus: of the governments of 
Arabic-language-using states, that is, the Arab League. It is not a 
consensus of typographers, whether upscale, with nice workstations and 
doing paid creative work for global ad agencies, or downscale, armed 
only with spray paint cans and imagination, working in Arabic or in 
languages using some or all of the Arabic characters. Nor is it a 
consensus of the other half of the world, governmental or not, which 
uses "Arabic Script" but not Arabic.

I don't particularly mind revisiting the same ground every four months, 
but what nags me now as then is the implicit assumption that Alireza 
needs Martin's permission to play with Farsi, and that simply isn't the 
case. The same implicit assumption was present four months ago, when it 
was someone else's turn to play language cop.

Rather than "ask" that the exotic user of the exotic language justify 
his or her statement of requirements to the dominant linguistic and 
cultural elite (Americans with former ARPA handles and others marginally 
associated with power), we could be asking in which specific cases, 
given the engineering constraints we have to suffer (and I really do 
mean treating "." as a god damn piece of direction leaking punctuation), 
we cannot presently state a correct specification.

The no-mixing proposal, now authored by Manal Ismail, Egypt's 
representative to ICANN's Governmental Advisory Committee, can be 
accomplished either in registry-local policy or in protocol-global 
policy. Independent of whether it is good policy or not (my view is that 
it is an over-specification, which Manal, Ram, and the attentive readers 
of both the Norwegian- and Iranian-hosted mailing lists may recall and 
ignore), it is imprudent to make the policy global in scope and 
implement it in the protocol, because it can be implemented as registry 
policy and because Arabic Script is used more widely than just in the 
ccTLD registries operated by or in cooperation with the member states of 
the Arab League.
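
To make the registry-policy option concrete, here is a minimal sketch in 
Python of a per-registry check. Everything in it is an assumption for 
illustration: the function names, the allow_mixing switch, and the 
choice to treat ASCII digits, Arabic-Indic digits (U+0660..U+0669), and 
Extended Arabic-Indic digits (U+06F0..U+06F9) as the three digit sets of 
interest. It is not taken from any registry's actual implementation.

```python
# Hypothetical registry-side check; all names here are illustrative only.
DIGIT_RANGES = {
    "ascii": (0x0030, 0x0039),                  # 0-9
    "arabic-indic": (0x0660, 0x0669),           # digits used with Arabic
    "extended-arabic-indic": (0x06F0, 0x06F9),  # digits used with Farsi, Urdu
}


def digit_sets_used(label):
    """Return the names of the digit sets that occur in a Unicode label."""
    used = set()
    for ch in label:
        cp = ord(ch)
        for name, (lo, hi) in DIGIT_RANGES.items():
            if lo <= cp <= hi:
                used.add(name)
    return used


def registry_allows(label, allow_mixing=False):
    """Apply one registry's local policy on digit mixing to a label.

    A registry that bans mixing leaves allow_mixing False; a registry
    that permits it sets allow_mixing True. Nothing here is global.
    """
    return allow_mixing or len(digit_sets_used(label)) <= 1
```

A registry following the no-mixing statement would call 
registry_allows(label) and reject a label that returns False; a registry 
elsewhere could pass allow_mixing=True without any change to the 
protocol.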

Somewhere in South West Asia someone is thinking of a new brand name and 
a marketing campaign, or is tagging the side of a bus with spray paint, 
and is creating strings from Latin characters, including digits, or 
glyphs that look like Latin characters, and from common characters from 
the union of Arabic, Farsi, and Urdu, including digits, or glyphs that 
look like those characters.

These are deprecated as domain names in the registries following 
governmental policy flowing from Arab League member states, and allowed 
elsewhere. Our problem is the subset which either (a) breaks input 
methods, under the assumption that we care about other people's broken 
code, or (b) enjoys undefined or non-unique processing outcomes from the 
current bidi algorithm, or (c) add your assertion of controlling 
rationale here.

Remember, this is not about the similarity of the glyphs for 5, 6, and 
7. That is a problem specific to the similarity of three pairs of code 
points, not all digits. The message John muled back from Cairo last 
November, which Ram has happily pointed the IDNAbis list to this week as 
some kind of consensus document recycling the same message, is about all 
digits, not just these three pairs with visually indistinguishable yet 
un-unified glyphs.


Martin Duerst wrote:
> Hello Alireza,
> Thanks for providing some background information. I agree with you
> that for each script, it's important to consider not only the needs
> of the 'main' or 'nominal' language written in that script, but also
> the needs of the (often very many and important) other languages.
> This is a bit easier for example for 'Latin' (where the language
> is long dead, but the script is extremely alive) than for some
> other cases such as Arabic.
> One specific question:
> Your mail seems to imply that prohibiting Arabic script digit mixing
> on the protocol level might create problems for some languages, maybe
> in particular in Iran. Is this the case? Can you give examples?
> Or are you simply thinking that the Arabic script digit mixing
> problem can be solved on the registry level without any special
> provisions in the protocol? (This would be my current position
> based on the (admittedly limited) information that I have.)
> Regards,   Martin.
> At 07:23 09/03/11, Alireza Saleh wrote:
>> Dear Ram,
>> I am really shocked about the text you've sent to IETF as 'ASIWG consensus'.
>> Let me explain:
>> 1. How was such consensus reached?! As far as I know, in the last two
>> ASIWG meetings no representative of non-Arab language communities using
>> the so-called Arabic script were present. In Cairo, Egypt did not issue us
>> visas, and no ASIWG meeting was scheduled for Mexico; you just gathered
>> whoever you could (all of them Arabic language speakers).
>> 2. We received a text from you this morning our time and before we could
>> react, you sent it to IETF as consensus.
>> 3. Let me point out for those who do not know, that Arabic-language
>> speakers represent less than half of the population that uses Arabic
>> script as native script.
>> 4. We have had hot discussions on ASIWG list about numerals without
>> reaching consensus.
>> 5. The text you have sent lacks technical merit; the discussions in the
>> IDNA list are well above this in technical detail and sophistication.
>> There is nothing in the text that dictates a technical decision at the
>> protocol level. All concerns expressed can be easily handled at the
>> registry level.
>> For the reference of the latest discussion about mixing digits please look 
>> at idna-arabicscript mailing list archive for November of 2008
>> http://lists.irnic.ir/pipermail/idna-arabicscript/2008-November/000317.html
>> Regards,
>> Alireza
>> Ram Mohan wrote:
>>> As I wrote yesterday, attached to this note is output from the ASIWG 
>>> (Arabic Script IDN Working Group)'s drafting team on Arabic script 
>>> digit mixing.  The discussion centers on a "no digit mixing" 
>>> philosophy to be implemented at the protocol level.
>>> Inside the ASIWG, earlier discussions centered around whether 
>>> disallowing digit mixing ought to be done at the protocol or the 
>>> application level, considering the risk and potential for harm.  A 
>>> final consensus is being worked on.
>>> -ram
>>> ------------------------------------------------------
>>> Ram Mohan
>>> email: rmohan at afilias.info <mailto:rmohan at afilias.info>
>>> office: +1.215.706.5700 | fax: +1.215.706.5701
>>> mobile: +1.215.431.0958
>>> Skype: gliderpilot30
>>> ------------------------------------------------------------------------
>>> _______________________________________________
>>> Idna-arabicscript mailing list
>>> Arabic Script IDN Working Group (ASIWG)
>>> Idna-arabicscript at lists.irnic.ir
>>> http://lists.irnic.ir/mailman/listinfo/idna-arabicscript
>> _______________________________________________
>> Idna-update mailing list
>> Idna-update at alvestrand.no
>> http://www.alvestrand.no/mailman/listinfo/idna-update
> #-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
> #-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst at it.aoyama.ac.jp     
