Friday, 4 August 2006

Google Research: "There's No Data Like More Data! [W]e decided to share this enormous dataset with you. We processed 1,011,582,453,213 words of running text and are publishing the counts for all 1,146,580,664 five-word sequences that appear at least 40 times. Watch for an announcement at the LDC [at the University of Pennsylvania], who will be distributing it soon, and then order your set of 6 DVDs."

ah, there's almost nothing you can't do with all those machines.

