prohibiting previously mapped and unmapped characters

Tina Dam tina.dam at icann.org
Wed Nov 29 21:31:09 CET 2006


I agree that getting some stats on the table would be a great idea.

> --On November 29, 2006 11:22 AM, Harald Alvestrand wrote:
> 
> > --On 29. november 2006 09:42 -0800 Erik van der Poel
> > <erikv at google.com> wrote:
> 
> > If it would help, I can take a look at Google's copies of web
> > documents to see which characters are actually used there and how
> > many occurrences there are of each. Of course, such a sample would
> > omit domain names used in email, but the web is quite an important
> > part of the Internet too.
> 
> I think such a listing (frequency count of characters 
> actually used in Punycoded domains that actually serve web 
> pages) would be very interesting.
> For the characters that *never* occur, it seems hard to argue 
> that a large community of present users would be hurt by 
> their omission.
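
A minimal sketch of the frequency count described above, assuming the
hostnames are already available as a list of strings. The function name
and the sample hostnames are illustrative only, not anything from the
thread. Python's built-in "punycode" codec is used for decoding; it does
no nameprep or IDNA validity checking, so this counts whatever code
points the labels decode to.

```python
from collections import Counter

ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded (IDN) label


def count_idn_characters(hostnames):
    """Count how often each character occurs in the decoded xn-- labels.

    Labels without the xn-- prefix are ignored, and labels that fail to
    decode are skipped rather than aborting the whole count.
    """
    counts = Counter()
    for host in hostnames:
        for label in host.lower().split("."):
            if not label.startswith(ACE_PREFIX):
                continue
            try:
                encoded = label[len(ACE_PREFIX):].encode("ascii")
                decoded = encoded.decode("punycode")
            except UnicodeError:
                continue  # malformed label: leave it out of the sample
            counts.update(decoded)
    return counts


if __name__ == "__main__":
    # Hypothetical sample; a real run would read hostnames from a crawl.
    sample = ["www.xn--mnchen-3ya.de", "xn--bcher-kva.example", "www.example.com"]
    for char, n in count_idn_characters(sample).most_common():
        print(f"U+{ord(char):04X} {char} {n}")
```

Characters that never show up in such a count over a large sample would
be the candidates for the "never occur" argument.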

Can we say: "As an initial starting point, for the characters that
*never* occur, it seems hard to argue that a large community of present
users would be hurt by their omission"?

If that would be of interest, I can also pass a request to the gTLD
registries to see whether they can provide data on how many of the
currently registered IDNs would be unavailable under the new protocol
limitations.

Tina


 
 
> While you're at it, perhaps you could get a count of how many 
> xn-- domains there are out there, as a percentage of the 
> total number of domains for which Google fetches web pages?
> 
> I *love* statistics :-)
> 
>                    Harald
> 
> _______________________________________________
> Idna-update mailing list
> Idna-update at alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/idna-update
> 


