Rationale problems

Harald Tveit Alvestrand harald at alvestrand.no
Sat Dec 6 16:14:34 CET 2008


John C Klensin wrote:
> --On Saturday, 06 December, 2008 08:10 +0100 Harald Tveit
> Alvestrand <harald at alvestrand.no> wrote:
>
>>> I may be just banging my head against a brick wall here, but
>>> nobody has been willing to step up to the plate to say that
>>> "this causes me problems because of situation X". No concrete
>>> examples have been cited. And if you can't give even one
>>> single example of this being a problem, then you *at least*
>>> should qualify it to indicate that it is an opinion.
>>
>> OK, I'll bang my head on the other side of the brick wall one
>> more time.
>>
>> IF a character is DISALLOWED, and IF clients check against
>> DISALLOWED as the protocol now requires them to do before they
>> allow lookup of a domain name...
>>
>> THEN anyone who wants to use a previously DISALLOWED name has
>> to:
>> 1) Change the specification to change DISALLOWED to PVALID
>> 2) Wait until all software that he wishes to have access his
>> domain name is upgraded before he can fully utilize his domain
>> name.
>>
>> In the period of 2), there will be some people able to use his
>> new domain name, and some people who can't use it. If one of
>> the first sends the name to one of the second, they will see
>> inconsistent behaviour: What works for one person won't work
>> for the other.
>>
>> If this isn't a concrete example of a problem, I don't know
>> what is.
>>     
>
> Harald,
>
> While I completely agree with your analysis and conclusion,
> reading through it has led me to what might be an insight about
> why this keeps coming up (e.g., why it seems unclear to Mark).
>
> One could make much the same argument about not looking up
> UNASSIGNED characters.  When a new character is added to Unicode
> whose properties would cause it to be PVALID, one has to wait
> until all lookup software is updated before that character is
> reliably available.
>
> There is, however, a difference.  If something is DISALLOWED, an
> explicit decision has been made, based on properties and maybe
> other considerations, to disallow it.  There is, of course, a
> possibility of getting that decision wrong, but it is of the
> same order of likelihood as other things we have disregarded, or
> been encouraged to disregard, on the basis that they are so
> unlikely, especially given the costs of changing our minds, that
> they will "never happen".
>
> The difference is that, with unassigned characters, we have no
> guarantees about the properties the code point would have if
> assigned to a character except what can be deduced from block
> location.  We cannot know for sure that it won't require
> contextual rules, that it will not decompose into some existing
> character or set of characters under NFC, and so on.  In
> principle, we can't even know whether it will have some
> prohibited general property (such as being a symbol), although
> block locations may provide a reliable hint about that.   So we
> have almost no choice other than to ban them at lookup time to
> be sure we do not need to make future incompatible changes.
> That implies considerable delays between when Unicode adopts a
> new character and it becomes fully useful, but I think we have
> to live with it.  In the case of DISALLOWED characters, we don't
> and shouldn't.
The other difference is that when a character is UNASSIGNED in the 
user's current Unicode installation, he can use it for exactly nothing, 
so it will not be surprising to him that he can't use it in domain names 
either.
For instance, he won't be able to see it in an e-mail containing a URL
that uses the newly assigned character until he upgrades his font 
libraries to support the new character.

On the other hand, if a character is DISALLOWED on his system, he might 
still see it, be able to type it and use it in other contexts. Seeing 
inconsistent behaviour for whether or not he is able to use it in a 
domain name will be more surprising. User astonishment is usually a Bad 
Thing.
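
To make the inconsistency concrete, here is a minimal sketch in Python of
two clients applying the lookup check with different versions of the
tables. Everything in it is an assumption made for illustration: the
Status enum, the can_look_up helper, the two tiny tables, and the use of
U+2603 as a stand-in for a character that moved from DISALLOWED to PVALID.
None of it is taken from the actual IDNA tables or from any real library.

```python
from enum import Enum


class Status(Enum):
    PVALID = "PVALID"
    DISALLOWED = "DISALLOWED"
    UNASSIGNED = "UNASSIGNED"


def can_look_up(label, table):
    """Return True only if every character of label is PVALID in this table."""
    # A character missing from the table is treated as UNASSIGNED,
    # so it is rejected just like a DISALLOWED one.
    return all(table.get(ch, Status.UNASSIGNED) is Status.PVALID for ch in label)


# Hypothetical tables; "\u2603" stands in for a reclassified character.
OLD_TABLE = {"a": Status.PVALID, "b": Status.PVALID, "\u2603": Status.DISALLOWED}
NEW_TABLE = {**OLD_TABLE, "\u2603": Status.PVALID}

label = "ab\u2603"
upgraded = can_look_up(label, NEW_TABLE)  # client with the changed table
legacy = can_look_up(label, OLD_TABLE)    # client that has not been upgraded
print(upgraded, legacy)  # True False
```

The same name resolves for the upgraded user and is refused for the other
one, even though both can see and type the character.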

                Harald



