Thursday, 20 April 2006

We released a first version of Wikipedia³ - Wikipedia in RDF! Thanks, System One, for the opportunity of doing such incredibly interesting and fun work. Another one of my babies leaving the safe home :)

What's Wikipedia³, you might ask? It models a few aspects of the English Wikipedia in RDF. Size? Roughly 47 million triples, or 1.6 GB of raw Turtle. What's described in the dataset? Currently that's basic page metadata and structural information such as links between Wikipedia pages and category relationships. What's it good for? Let your imagination go wild! Or take my words as a starter: readily extracted data for loads of funny and interesting network / graph analysis and visualisation tasks. Or challenge a rule-based inferencer of your choice to answer the question: "Which computer scientists were born in 1920?"

Only three predicates describe the structural information, but scaled to the sheer size of Wikipedia they make for a fascinating dataset. I've literally spent hours browsing through the link and category network, discovering countless interesting things about the described topics and about Wikipedia's overall structure.
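
To give a flavour of the graph analysis mentioned above, here is a minimal Python sketch that counts how many incoming links each page has. It rests on several assumptions that are not from the dataset itself: the predicate URI `LINKS_TO` and the `example.org` page URIs are made up for illustration, and the input is assumed to be simple one-triple-per-line statements with full URIs in angle brackets (a subset of Turtle). Real Turtle with prefixes or literals would need a proper RDF parser.

```python
import re
from collections import Counter

# Hypothetical predicate URI, for illustration only; the real vocabulary
# used by the dataset is not named in this post.
LINKS_TO = "http://example.org/wikipedia3#internalLink"

# Matches a single "<subject> <predicate> <object> ." line (URIs only).
TRIPLE = re.compile(r"^\s*<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def iter_triples(lines):
    """Yield (subject, predicate, object) for every line that parses."""
    for line in lines:
        match = TRIPLE.match(line)
        if match:
            yield match.groups()


def in_degree(lines, predicate=LINKS_TO):
    """Count, per target page, how many triples with `predicate` point at it."""
    counts = Counter()
    for _subject, pred, obj in iter_triples(lines):
        if pred == predicate:
            counts[obj] += 1
    return counts


# Tiny made-up sample: A and C link to B, B links to A.
# The last line uses a different (also made-up) predicate and is ignored.
sample = [
    "<http://example.org/page/A> <http://example.org/wikipedia3#internalLink> <http://example.org/page/B> .",
    "<http://example.org/page/C> <http://example.org/wikipedia3#internalLink> <http://example.org/page/B> .",
    "<http://example.org/page/B> <http://example.org/wikipedia3#internalLink> <http://example.org/page/A> .",
    "<http://example.org/page/A> <http://example.org/wikipedia3#category> <http://example.org/cat/X> .",
]

counts = in_degree(sample)
```

Because `in_degree` accepts any iterable of lines, an open file handle can be passed in directly, so a dump of this size would be streamed line by line rather than loaded into memory; `counts.most_common(10)` then lists the ten most linked-to pages.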

Inspired? Turned off? Doing interesting stuff with Wikipedia³? Found a bug? Let me know about it. Have fun!


echo earlZstrainYat|tr ZY @.