by earl, 5087 days ago
Google Research: "There's No Data Like More Data! [W]e decided to share this enormous dataset with you. We processed 1,011,582,453,213 words of running text and are publishing the counts for all 1,146,580,664 five-word sequences that appear at least 40 times. Watch for an announcement at the LDC [at the University of Pennsylvania], who will be distributing it soon, and then order your set of 6 DVDs."
ah, there's almost nothing you cannot do with all those machines.
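
for a sense of what "counts for all five-word sequences that appear at least 40 times" means in practice, here is a toy sketch. it is not Google's pipeline: the function name `ngram_counts`, the whitespace tokenization and the tiny sample corpus are all assumptions made for illustration, and a trillion-word corpus obviously would not fit in one in-memory counter.

```python
from collections import Counter


def ngram_counts(tokens, n=5, min_count=40):
    """Count every n-token window and keep those seen at least min_count times.

    Hypothetical helper for illustration only; the defaults mirror the
    numbers in the quote (five-word sequences, threshold of 40).
    """
    counts = Counter(
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )
    return {gram: c for gram, c in counts.items() if c >= min_count}


# Toy corpus: splitting on whitespace is an assumption, not the real tokenizer.
tokens = ("there is no data like more data " * 3).split()

# The threshold is lowered to 3 so the toy corpus yields something.
result = ngram_counts(tokens, n=5, min_count=3)

for gram, count in sorted(result.items()):
    print(" ".join(gram), count)
```

the threshold is the interesting part: dropping rare sequences is what keeps the published table at about 1.1 billion rows instead of one row for every five-word sequence ever seen.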