punyspace summary

Thu May 15 14:48:53 CEST 2008

Dear all,
As far as I understand it, the punyspace (punycoded namespace) is 
formed by all the labels with an "xn--" header. At least three 
levels of tests can be carried out on it before resolving the 
domain names that include labels from the punyspace.

1. Punycode validity test.
     Roughly half of the punycode space can resolve as valid Unicode 
strings, the other half being false Unicode, or funycode.

2. The non-funycode namespace can be split into:
     - disallowed code-points, identified for their capacity to 
confuse users or applications.
     - possible code-points

3. The possible code-points can be split into:
     - permitted code-points
     - non-permitted code-points.

Each split calls for a filtering step.

In (1) the test is against the punycode process. It would therefore 
be a positive step to have a very strict document, completed by a 
program guaranteed by the IETF (or Unicode, or ICANN) that everyone 
could use to check, for legal purposes, the nature of an "xn--" 
label. The resulting funycodes are public domain, and anyone can 
build private extensions through Sunycodes (specialised punycode-like 
processes).
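
To make test (1) concrete, here is a minimal sketch, assuming the 
"punycode" codec of the Python standard library as the decoder. It is 
not the strict, guaranteed program discussed above, and the function 
name is only illustrative.

```python
ACE_PREFIX = "xn--"

def decode_punyspace_label(label):
    """Return the Unicode string behind an "xn--" label.

    Returns None when the part after the header does not decode,
    i.e. when the label would count as funycode in the terms above.
    """
    if not label.lower().startswith(ACE_PREFIX):
        raise ValueError("label is not in the punyspace")
    encoded = label[len(ACE_PREFIX):]
    try:
        # Non-ASCII input and malformed punycode both raise UnicodeError.
        return encoded.encode("ascii").decode("punycode")
    except UnicodeError:
        return None
```

For example, decode_punyspace_label("xn--bcher-kva") gives "bücher", 
while a truncated label such as "xn--bcher-kv" gives None.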

In (2) the test is against an ISO 10646 based list documented by the 
IETF (which Unicode or another organization may wish to extend).
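
A sketch of how test (2) might look once such a list exists. The 
DISALLOWED set below is a placeholder with two code-points picked 
purely for illustration; it is not the IETF list, and the function 
names are hypothetical.

```python
# Placeholder for the IETF-documented list (assumption, not the real data).
DISALLOWED = frozenset({
    0x200B,  # ZERO WIDTH SPACE, invisible to users
    0x2044,  # FRACTION SLASH, looks like "/"
})

def disallowed_code_points(text):
    """Return the disallowed code-points found in a decoded label, in order."""
    return [ord(ch) for ch in text if ord(ch) in DISALLOWED]

def is_possible(text):
    """True when the decoded label contains no disallowed code-point."""
    return not disallowed_code_points(text)
```

An extension by Unicode or another organization would then amount to 
a union of sets, leaving the test itself unchanged.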

In (3) the test is against a user-defined or user-chosen list or 
process. The IETF should specify the possible formats and answers of 
such processes, and the application standardisers (W3C for the Web, 
IETF for SMTP, etc.) should determine the resulting application 
behaviour.
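
To illustrate what a user-level process with a fixed answer format 
could look like, here is a sketch. The FilterAnswer structure and 
make_user_filter are hypothetical names; no such format has been 
specified by the IETF.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FilterAnswer:
    """Hypothetical answer format: a verdict plus the first offending character."""
    permitted: bool
    offending: Optional[str] = None

def make_user_filter(permitted_chars):
    """Build a test (3) filter from a user-chosen list of permitted characters."""
    allowed = frozenset(permitted_chars)

    def check(text):
        for ch in text:
            if ch not in allowed:
                return FilterAnswer(False, ch)
        return FilterAnswer(True)

    return check
```

A user who only accepts German letters would build the filter with 
make_user_filter("abcdefghijklmnopqrstuvwxyzäöüß"). What the 
application does with a negative answer is left to its standardiser.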

This should result in an international document co-signed by 
WIPO, protecting zone managers from any legal responsibility in the use 
of a domain name, such responsibility resting with the registrant. This 
agreement should also include an international clause protecting TM 
owners from their current obligation to protect their TM in "xn--" 
headed domain names. Another clause should consider the case of TM 
symbols entered as code-points in any semiotic table.

The documentation produced for the user level filtering should make 
sure it does not lead to conflicts with possibilities permitted by 
hosts.txt (aliases and aliases table).

Please correct me if I am wrong before I use this summary to explain 
the issue on http://wikidna.org which is used by some as a focal 
(sometimes bilingual) reminder.
Thank you.
jfc