Label separators (was: Re: Urdu and SPACE, FULL STOP (Re: comments on IDNAbis: draft-faltstrom-idnabis-tables-04.txt Arabic))

Alireza Saleh saleh at nic.ir
Sun Feb 24 09:28:58 CET 2008


Dear all,

I also agree with Dr. Klensin and Dr. Sarmad. I think some automatic 
mapping needs to be done, as I suggested to Dr. Sarmad at the ICANN 
meeting, and I'm very happy that this is being considered as a solution 
on this list. I'm also thinking about ZWNJ and space. In Persian and in 
some other Arabic-script languages, ZWNJ is used to correct the glyph 
shape. I think it may be possible to map white space to ZWNJ, again 
before storing the name in a file or sending it on the wire. In Persian, 
almost 80% of people don't know about ZWNJ and use a space instead. I 
think if IDNA supports this idea, then we can ask application providers 
to enable this feature in their applications.
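
To make the space-to-ZWNJ idea concrete, here is a minimal sketch in 
Python. It only illustrates the proposal; it is not part of any IDNA 
specification, and the function name spaces_to_zwnj is hypothetical. It 
assumes the input is a single label as typed by the user, and that every 
run of ASCII spaces inside it was meant as a ZWNJ.

```python
ZWNJ = "\u200c"  # U+200C ZERO WIDTH NON-JOINER


def spaces_to_zwnj(label: str) -> str:
    """Replace each run of ASCII spaces inside a label with one ZWNJ.

    Hypothetical helper for illustration: leading and trailing spaces
    are dropped, and consecutive spaces collapse to a single ZWNJ.
    """
    # Splitting on " " yields empty strings for leading, trailing, and
    # repeated spaces; filtering them out handles all three cases.
    parts = [part for part in label.split(" ") if part]
    return ZWNJ.join(parts)


# A Persian word typed with a space where ZWNJ belongs.
typed = "\u0645\u06cc \u062e\u0648\u0627\u0647\u0645"
mapped = spaces_to_zwnj(typed)
```

An application would apply something like this locally, before the name 
is stored or sent anywhere, so the space itself never leaves the user's 
machine.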

Best
Alireza.

Sarmad Hussain wrote:
> Dear John Klensin and all,
>
> Thank you for your comments.  I understand and agree.  This is exactly what
> I am arguing for as well, i.e. :
>
> "if you need to use a convention locally to permit easier typing of that
> character, you can substitute any convenient punctuation (or other
> disallowed) character for it... as long as it is mapped to ASCII period
> before you store it in a file or transmit it on the wire"
>  
> However, if IDN standards stop short of providing clear auxiliary
> recommendations on WHICH "convenient punctuation" to substitute and HOW
> (i.e. map which UNICODE characters onto which ASCII characters),
> applications providers like Microsoft, Mozilla, etc., tend to implement
> their own interpretation for the browsers.  Unfortunately, many user
> communities do not have experience to get their voice to these application
> providers. 
>
> So if the standards list these auxiliary recommendations, there is a likely
> chance that they will be supported by the application providers as well,
> even if language communities are not able to contact them directly.  
>
> In summary, I am not asking that 06D4 be transmitted on the wire.  I am
> suggesting that, to ensure that URDU FULL STOP is processed on the
> application end, relevant IDN standards should explicitly recommend that
> application providers map 06D4 onto a dot, if they see it in a domain
> name, before transmitting it on the wire.
>
>
>
> Best regards,
> Sarmad   
>
>
>  
>   
>> -----Original Message-----
>> From: John C Klensin [mailto:klensin at jck.com]
>> Sent: Saturday, February 23, 2008 10:15 PM
>> To: Sarmad Hussain
>> Cc: idna-update at alvestrand.no
>> Subject: Label separators (was: Re: Urdu and SPACE, FULL STOP (Re:
>> comments on IDNAbis: draft-faltstrom-idnabis-tables-04.txt Arabic))
>>
>> Dr. Hussain (and others),
>>
>> I've been distracted by other work for a few days, but want to
>> address the FULL STOP problem, which, as Harald pointed out, is
>> associated with a label separator issue and not an issue with
>> "tables" at all.
>>
>> The problem we face here is that the single most critical
>> consideration in looking at IDNA is that the DNS, and DNS
>> applications that are not IDNA-aware, must continue to work well
>> and predictably when confronted with IDN labels in either native
>> Unicode character or ACE form.
>>
>> Personally, I frequently wish that constraint did not exist
>> because one can imagine many interesting things that could be
>> done without it.  But the price of eliminating the constraint is
>> modifications to the DNS that would take us considerable effort
>> and probably many years to deploy.  No one wants to wait that
>> long so we are stuck with the constraint.
>>
>> For label separators, the constraint has even stronger
>> implications than it does for matching rules (I've discussed the
>> latter in another note) because applications and systems that
>> are otherwise unaware of the DNS itself (not just unaware of
>> IDNA) have to be able to parse full domain names into labels in
>> order to map back and forth between the "labels separated by
>> full stops" format that we usually see and the DNS internal
>> format (a list of labels with explicit length information).
>> Even the language of IDNA2003 about mapping of period-like
>> characters isn't sufficient to prevent those characters from
>> showing up in contexts in which they would interfere with domain
>> name parsing.  However the intent is clear, and that intent is
>> to be sure that, by the time a domain name makes it into a file
>> or out on the Internet, the things that look like full stops
>> must be translated into ASCII periods and the latter substituted.
>>
>> Oddly, this is where the "no mapping in the protocol" principle
>> of the IDNA200X proposals becomes very helpful.  The IDNA2003
>> version says, in essence, "these characters (and no others) are
>> considered appropriate alternative forms of label separators,
>> but you have to map them to ASCII period when you see them".
>> The IDNA200X version is equivalent to "the only valid label
>> separator on the wire or in interchange is ASCII period.
>> However, since we have prohibited all other punctuation
>> characters (other than hyphen) from ever actually appearing in a
>> domain name, if you need to use a convention locally to permit
>> easier typing of that character, you can substitute any
>> convenient punctuation (or other disallowed) character for it...
>> as long as it is mapped to ASCII period before you store it in a
>> file or transmit it on the wire".
>>
>> That is clearly not a perfect solution, but it gives you the
>> flexibility you need while preserving both global
>> interoperability and the ability for non-IDNA applications to
>> unambiguously parse domain names into labels.
>>
>>     john
>>     
>
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update
>   
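
The label-separator convention discussed in the quoted messages can be 
sketched in Python as follows. This is an illustration under stated 
assumptions, not a definitive implementation: the function names are 
hypothetical, the separator set combines the three alternative dots that 
IDNA2003 (RFC 3490) lists with U+06D4 as proposed in this thread, and 
the wire encoder assumes the labels are already in ASCII (ACE) form.

```python
# Characters accepted locally as stand-ins for the label separator.
# U+3002, U+FF0E and U+FF61 are the alternatives listed by IDNA2003;
# U+06D4 (ARABIC FULL STOP) is the addition proposed in this thread.
LOCAL_SEPARATORS = "\u3002\uff0e\uff61\u06d4"
_TO_ASCII_DOT = str.maketrans({ch: "." for ch in LOCAL_SEPARATORS})


def normalize_separators(name: str) -> str:
    """Map every local separator to ASCII period before storage or transmission."""
    return name.translate(_TO_ASCII_DOT)


def to_wire_labels(name: str) -> bytes:
    """Encode an ASCII domain name as length-prefixed labels (DNS internal format)."""
    out = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")  # raises if a label is not ASCII yet
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"label length out of range: {label!r}")
        out.append(len(raw))  # one length octet per label
        out += raw
    out.append(0)  # zero-length root label terminates the name
    return bytes(out)


typed = "www\u06d4example\u3002com"
on_the_wire = to_wire_labels(normalize_separators(typed))
```

The point of doing the mapping first is that to_wire_labels, like any 
non-IDNA-aware parser, only ever splits on ASCII period.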


