The real issue: interoperability, and a proposal (Was: Consensus Call on Latin Sharp S and Greek Final Sigma)
Mark Davis ☕
mark at macchiato.com
Tue Dec 1 20:49:00 CET 2009
It is approximately 60, as you computed. The trillion figure was in a public
posting from July 2008, which is why we can quote it.
2009/12/1 Harald Alvestrand <harald at alvestrand.no>
> Mark Davis ☕ wrote:
>> As far as Harald's back-of-the-envelope calculations go, they present a
>> very inaccurate picture of the scale. Here are some more exact figures for
>> that data.
>> 1. 819,600,672 = sample size of documents
>> 2. 5,000 = links with eszed in the sample
>> 3. 1,000,000,000,000 = total documents in index (2008)
>> 4. 1,220 = scaling factor (= total docs / sample size)
>> 5. 6,100,532 = estimated total links with eszed (= scaling *
>> sample eszed links)
>> Even this has to be taken with a certain grain of salt, since (a) it is
>> assuming that the sample is representative (although we have reasonable
>> confidence in that), and (b) it doesn't weight the "importance" of the links
>> (in terms of the number of times they are followed), and (c) this data was
>> collected back in Nov 2008, so we've had another year of growth since then.
> I obviously need a bigger envelope :-) - I didn't think we had one trillion
> documents in the 2008 index.
> One missing number: how many links per document?
> Obviously #eszed links / #documents can't be the basis of the 0.00001%
> figure that Erik quoted, because 5000/819600672 = 0.00061005%, which is
> a factor of about 60 larger than 0.00001%. But if we estimate 60 links per
> document, the 0.00001% fits nicely as the percentage of links that contain
> eszed.
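
For anyone who wants to re-check the arithmetic quoted above, here is a minimal sketch. The numbers are the ones given in the thread; the variable names are illustrative only, and the links-per-document value is simply what falls out if the 0.00001% figure is read as a share of all links.

```python
# Figures quoted in the thread; variable names are illustrative assumptions.
sample_docs = 819_600_672            # sample size of documents
sample_eszett_links = 5_000          # links with eszed in the sample
total_docs = 1_000_000_000_000       # total documents in index (2008)
quoted_link_share = 0.00001 / 100    # the 0.00001% figure, as a fraction

# Scale the sample count up to the whole index.
scaling = total_docs / sample_docs
estimated_total_links = scaling * sample_eszett_links

# Eszed links per document in the sample, as a fraction.
eszett_links_per_doc = sample_eszett_links / sample_docs

# Links per document implied if the quoted share is a share of all links.
implied_links_per_doc = eszett_links_per_doc / quoted_link_share

print(f"scaling factor:          {scaling:,.0f}")
print(f"estimated total links:   {estimated_total_links:,.0f}")
print(f"eszed links per doc:     {eszett_links_per_doc * 100:.8f}%")
print(f"implied links per doc:   {implied_links_per_doc:.0f}")
```

The same caveats apply as in the quoted message: this assumes the sample is representative and does not weight links by how often they are followed.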