[Idna-arabicscript] mapping of Full Stops

Vint Cerf vint at google.com
Sun Oct 11 18:24:51 CEST 2009


Erik,

thanks for these observations.

Clearly we would not advocate use of U+06D4 for interchange.  The Mappings
document was intended to focus on non-interchange UI treatments.

I agree that this is a rather substantive change; may we hear from others
in IDNABIS WG please?

We are going to have to draw these last call discussions to a close soon.

vint


On Oct 11, 2009, at 12:03 PM, Erik van der Poel wrote:

> In my opinion, it would be premature to include U+06D4 in the IDNAbis
> mapping draft (apart from the fact that it is rather late in the Last
> Call process to be making such a change). U+3002 has a much longer
> history in IDNA and is much more firmly established. If U+06D4 would
> be mapped to U+002E at the data interchange level (think HTML), there
> would be a period where IDNA2003 implementations and new
> implementations would resolve domain names differently. Of course, the
> IDNAbis mapping draft explicitly states that it is intended to be used
> at the UI level (e.g. keyboard input), but, frankly, I don't think we
> have much experience with IDNA implementations that distinguish
> between the UI and data interchange levels.
>
> Erik
>
> On Sun, Oct 11, 2009 at 6:28 AM, Sarmad Hussain
> <sarmad.hussain at gmail.com> wrote:
>> Thanks.
>>
>> Yes, it is a request to include U+06D4 in the document explicitly as
>> it seems possible.
>>
>> The reason it becomes important for our language community is that,
>> if it is listed, it will most likely be implemented by the application
>> providers (even though it may not be binding), because these IDNAbis
>> documents will be read thoroughly; if it is not listed, it will
>> not be implemented (as it is difficult for application providers to
>> investigate the needs of all the different language communities).  So
>> listing can make a big difference.
>>
>> regards,
>> Sarmad
>>
>>
>>
>>
>> On Sun, Oct 11, 2009 at 6:15 PM, Vint Cerf <vint at google.com> wrote:
>>> keep in mind that the Mappings document is NOT normative. It is intended to
>>> give some ideas for localization and pre-processing. The important point is
>>> that only U+002E will be recognized in protocol as a label separator. For
>>> purposes of exchanging IDNs, that's important. For local contexts, one might
>>> allow alternative full-stop inputs but these would need to be converted to
>>> the U+002E form prior to initiating a DNS query. It would probably be wise
>>> also to convert to U+002E for purposes of canonical exchange of domain names
>>> with other parties.
>>>
>>> For Pete Resnick and Paul Hoffman:
>>> this email might be interpreted as a request to add U+06D4 to the Mappings
>>> list of potential local mappings to U+002E. Have you an opinion whether this
>>> edit would be appropriate?
>>> vint
>>>
>>>
>>> On Oct 11, 2009, at 9:52 AM, Sarmad Hussain wrote:
>>>
>>>
>>> In earlier discussions on U+06D4 (ARABIC FULL STOP), which is necessary for
>>> Urdu as a label separator (the reasons have been given on this list
>>> earlier), it was suggested that the various full stops will not be allowed
>>> and be mapped.  It was subsequently requested to include the mapping
>>> reference in IDNA200x documents to ensure that the application providers
>>> incorporate it, but the request was not considered positively as it was
>>> perhaps suggested that such recommendations can not be made part of the
>>> protocol.  However, the recent mapping document
>>> (http://tools.ietf.org/html/draft-ietf-idnabis-mappings-04) says on pg. 2:
>>>
>>>
>>>    4.  [I-D.ietf-idnabis-protocol] is specified such that the protocol
>>>        acts on the individual labels of the domain name.  If an
>>>        implementation of this mapping is also performing the step of
>>>        separation of the parts of a domain name into labels by using the
>>>        FULL STOP character (U+002E), the following character can be
>>>        mapped to the FULL STOP before label separation occurs:
>>>
>>>        *  IDEOGRAPHIC FULL STOP (U+3002)
>>>
>>>        There are other characters that are used as "full stops" that one
>>>        could consider mapping as label separators, but their use as such
>>>        has not been investigated thoroughly.
>>>
>>>
>>> If this is being explicitly done for U+3002, it could be done explicitly for
>>> ARABIC FULL STOP (U+06D4) as well. What is the reason for not including
>>> other such possible cases explicitly?
>>>
>>> Regards,
>>> Sarmad
>>>
>>>
>>>
>>
>>
>>
>> --
>> ---------------------------------------------------
>> Sarmad Hussain
>> Professor and Head
>> Center for Research in Urdu Language Processing
>> National University of Computer and Emerging Sciences
>> B Block Faisal Town
>> Lahore, Pakistan
>>
>> Ph: +9242 111 128128
>> Fax: +9242 5165232
>> URL: www.crulp.org
>>
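
For readers following the thread, the pre-processing step discussed above
(map alternative full stops to U+002E at the UI level, then split into labels
on U+002E only) might look roughly like the sketch below. It is only an
illustration under stated assumptions: the function and constant names are
hypothetical and come from no IDNA draft or library, U+3002 is the one
character the quoted draft lists, and U+06D4 is treated as an optional local
policy that is off by default, since it is the mapping being requested here
and is not in the draft.

```python
# Illustrative sketch only; all names here are hypothetical.
FULL_STOP = "\u002E"

# The one character the quoted mappings draft lists as mappable to FULL STOP.
DRAFT_SEPARATORS = {"\u3002"}  # IDEOGRAPHIC FULL STOP

# Requested on this list for Urdu; NOT in the draft. Assumed local UI policy.
LOCAL_SEPARATORS = {"\u06D4"}  # ARABIC FULL STOP


def normalize_separators(name, allow_local=False):
    """Return name with alternative full stops replaced by U+002E."""
    candidates = set(DRAFT_SEPARATORS)
    if allow_local:
        candidates |= LOCAL_SEPARATORS
    return "".join(FULL_STOP if ch in candidates else ch for ch in name)


def split_labels(name, allow_local=False):
    """Map separators first, then split into labels on U+002E only."""
    return normalize_separators(name, allow_local).split(FULL_STOP)
```

With allow_local left at False, a name typed with U+06D4 stays a single
label, which is the divergence between implementations that Erik describes;
with allow_local=True the same input splits into two labels. Either way, only
the U+002E form would be passed on to a DNS query or exchanged with other
parties.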


