A couple of months ago, I watched Tim Berners-Lee’s TED talk on Linked Data in which he lays out his vision and explains why Linked Data will be the next major use of the web.
During the talk he mentioned the DBpedia project, an effort to extract structured information from the content of Wikipedia. It currently describes 2.6 million things with 274 million facts.
So how do you access all this data? Well, one of the ways is to use SPARQL (SPARQL Protocol and RDF Query Language).
For instance, to find all the films directed by Peter Jackson, you could run the following query:
SELECT ?film WHERE { ?film dbpedia2:director <http://dbpedia.org/resource/Peter_Jackson> }
Click here to see the results.
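If you would rather run the query from a script than from a web page, here is a minimal Python sketch using only the standard library. It makes a few assumptions that are not in the post: the endpoint URL is DBpedia's public SPARQL endpoint, the helper names (`build_request`, `bindings_to_rows`, `run_query`) are made up for illustration, and the query uses the full property URI rather than the `dbpedia2:` prefix so it does not rely on prefixes being predefined.

```python
import json
import urllib.parse
import urllib.request

# Assumption: DBpedia's public SPARQL endpoint (not named in the post).
ENDPOINT = "https://dbpedia.org/sparql"

# The Peter Jackson query, written with a full property URI.
QUERY = """
SELECT ?film WHERE {
  ?film <http://dbpedia.org/property/director> <http://dbpedia.org/resource/Peter_Jackson>
}
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request; the SPARQL protocol carries the query in the 'query' parameter."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def bindings_to_rows(results):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


def run_query(query, endpoint=ENDPOINT):
    """Send the query over the network and return the flattened rows."""
    with urllib.request.urlopen(build_request(query, endpoint), timeout=30) as response:
        return bindings_to_rows(json.load(response))
```

Calling `run_query(QUERY)` should return one dict per film, each with a `film` key holding the resource URI, assuming the endpoint is reachable and still answers in this format.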
To see whether there is a correlation between budget and runtime for his films, you could use this query:
SELECT ?film_directed_by_pj ?budget ?runtime WHERE { ?film_directed_by_pj <http://dbpedia.org/property/director> <http://dbpedia.org/resource/Peter_Jackson> . ?film_directed_by_pj <http://dbpedia.org/ontology/budget> ?budget . ?film_directed_by_pj <http://dbpedia.org/property/runtime> ?runtime }
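The query only fetches the raw budget and runtime values; the correlation still has to be computed from them. A rough sketch of a Pearson correlation coefficient in Python follows. The function name `pearson` is made up for illustration, and it assumes the values have already been converted to plain numbers (the raw results may need cleaning before that is true).

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length number lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)
```

A result near 1 would suggest that bigger budgets go with longer films, a result near 0 would suggest no linear relationship, and with only a handful of films either should be read with caution.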
This query returns the list of countries in the Southern Hemisphere along with their latitudes and longitudes:
SELECT ?countryName ?latd ?latns ?longd ?longew WHERE { ?c rdf:type <http://dbpedia.org/ontology/Country>. ?c dbpedia2:commonName ?countryName . ?c dbpedia2:latns ?latns . ?c dbpedia2:latd ?latd . ?c dbpedia2:longd ?longd . ?c dbpedia2:longew ?longew . FILTER REGEX(?latns, "S", "i"). }
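The query returns degrees and hemisphere letters as separate columns. If signed coordinates are easier to work with (for plotting, say), a small helper along these lines could convert them. This is only a sketch: the function name is hypothetical, it assumes `latns` holds "N" or "S" and `longew` holds "E" or "W", and it ignores minutes and seconds because the query does not fetch them.

```python
def signed_coordinates(latd, latns, longd, longew):
    """Turn degrees plus hemisphere letters into signed (latitude, longitude)."""
    lat = float(latd)
    lon = float(longd)
    # South and west are conventionally negative.
    if latns.strip().upper() == "S":
        lat = -lat
    if longew.strip().upper() == "W":
        lon = -lon
    return lat, lon
```

For example, `signed_coordinates("35", "S", "149", "E")` gives `(-35.0, 149.0)`.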
Of course, DBpedia is not the only source of data. This diagram (created by Chris Bizer) shows which sources are currently available on the web:
Click here for an interactive version.
This is all pretty interesting stuff and you can see why Tim Berners-Lee and others are so excited about the concept of Linked Data.
Want to know more? Here are some links: