Changing DISALLOWED (was Re: Reserved general punctuation)

Vint Cerf vint at google.com
Thu May 1 18:04:51 CEST 2008


Paul,

I used a phrase that apparently wasn't as clear as I thought. When I  
said "reimplementation of filtering" I simply had in mind that the  
filtering system would/might need to change to accommodate the new  
status of a previously UNASSIGNED code point. So changing the table  
used for filtering was what I was thinking when I used the term  
"reimplementation" - thanks for the clarification.

Coming back to UNASSIGNED and DISALLOWED, is it correct to say that
all UNASSIGNED code points are DISALLOWED (since allowing them makes
no sense if the code point has no character associated with it)?
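
To make the "table change, not reimplementation" point concrete, here is
a minimal sketch in Python. It is only an illustration: the table
contents, the ALLOWED label, the function name is_permitted, and the
choice of code point 0x0371 are all assumptions made for the example,
not anything defined in the drafts. Treating UNASSIGNED the same as
DISALLOWED here is likewise an assumption of the sketch, not a settled
answer to the question above.

```python
# Property labels. UNASSIGNED and DISALLOWED come from the thread;
# ALLOWED is a hypothetical stand-in for "something else".
UNASSIGNED = "UNASSIGNED"
DISALLOWED = "DISALLOWED"
ALLOWED = "ALLOWED"


def is_permitted(label, table):
    """Return True if every code point in `label` is ALLOWED in `table`.

    Code points absent from the table are treated as UNASSIGNED and
    therefore rejected (an assumption of this sketch).
    """
    return all(table.get(ord(ch), UNASSIGNED) == ALLOWED for ch in label)


# Table derived from an older Unicode version (contents invented).
table_old = {0x61: ALLOWED, 0x62: ALLOWED, 0x2D: ALLOWED, 0x40: DISALLOWED}

# A newer Unicode version assigns a character to 0x0371, so the derived
# table gains one entry. The filtering function itself is untouched.
table_new = {**table_old, 0x0371: ALLOWED}

label = "ab\u0371"
print(is_permitted(label, table_old))  # rejected under the old table
print(is_permitted(label, table_new))  # accepted under the new table
```

The filter logic is identical in both calls; only the data it consults
differs, which is the distinction Paul drew.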

vint



On May 1, 2008, at 11:41 AM, Paul Hoffman wrote:

> At 5:25 AM -0400 5/1/08, Vint Cerf wrote:
>> At the risk of prolonging this thread, I am assuming that  
>> DISALLOWED is a condition that makes sense only for an ASSIGNED  
>> character and that UNASSIGNED means the code point has not been  
>> assigned any meaning or character.
>
> That is not a good assumption. As Mark said yesterday:
>
> At 6:59 PM -0700 4/30/08, Mark Davis wrote:
>> In Unicode, what we've been referring to as "unassigned" (more  
>> precisely gc=Cn) means that a code point (from 0 to 10FFFF) is not  
>> assigned *to a character*. The code point may actually have  
>> properties even though it does not represent a character: it might  
>> have bidi properties, block properties, or, as in this case, be  
>> default-ignorable or a noncharacter.
>
>
>> This suggests that anything UNASSIGNED should be rejected at the  
>> protocol level (no registration... no lookup either?).
>
> That could be true regardless of whether or not TUC had given the  
> gc=Cn properties.
>
>> Wouldn't this imply that a new revision of UNICODE that ASSIGNS a  
>> previously UNASSIGNED character may require reimplementation at  
>> protocol level of filtering since the previously UNASSIGNED code  
>> point now has properties that might allow it to be used in IDNs?
>
> Now we fall into capitalization-in-specs issues.
>
> - If TUC moves a character out of gc=Cn (that is, they assign a  
> code point to a character), and that character had the IDNA  
> property of UNASSIGNED, TUC's action would change the IDNA property
> from UNASSIGNED to something else.
>
> - If TUC moves a character out of gc=Cn (that is, they assign a  
> code point to a character), and that character had an IDNA
> property other than UNASSIGNED, TUC's action might or might not  
> change the IDNA property.
>
> Neither would "require reimplementation at protocol level of  
> filtering". The first would, and the second might, change the table  
> used for filtering.
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update
