Question

What I currently believe to be true:

  • Triplestores are all about 'linked data': data that is decoupled from any given application and can be augmented through incoming links formed by new data. This concept drives the Web.
  • URIs allow linked data to be hosted across heterogeneous environments.
  • For two machines to interoperate, they must both work to an agreed standard.
  • The W3C has defined such standards, specifically RDF (for representing data) and SPARQL (for querying).

Question:

Is it possible to run a query that spans multiple heterogeneous RDF triplestores?

Example:

  • We have 3 RDF triplestores running on different triplestore software.
  • Triplestore A contains nodes which reference other nodes in triplestores B and C.
  • We run a single query that should traverse nodes from all 3 stores.

Solution

Yes, this is quite possible, and it's what the SPARQL 1.1 Federated Query W3C recommendation is about. In a SPARQL query you use the SERVICE keyword to name the remote SPARQL endpoints that parts of the query should be sent to. An example from that recommendation is:

PREFIX foaf:   <http://xmlns.com/foaf/0.1/>
SELECT ?name
FROM <http://example.org/myfoaf.rdf>
WHERE
{
  <http://example.org/myfoaf/I> foaf:knows ?person .
  SERVICE <http://people.example.org/sparql> { 
    ?person foaf:name ?name . } 
}

The SERVICE keyword there says that the triple pattern ?person foaf:name ?name should be evaluated against the endpoint <http://people.example.org/sparql>, and the results are joined with the rest of the query. In your case, you might end up with something like:

PREFIX ex: <http://example.org/>
SELECT ?person ?age ?weight ?resume WHERE {
  VALUES ?person { ex:Bill ex:John }
  SERVICE <http://jobs.example.org/sparql>   { ?person ex:resume ?resume } 
  SERVICE <http://age.example.org/sparql>    { ?person ex:age    ?age    } 
  SERVICE <http://weight.example.org/sparql> { ?person ex:weight ?weight } 
}

You'll have to run this query somewhere, though. If you run it against a local engine, then all three triplestores can be specified as services; if you run it directly against one of them, then only the other two need to be specified as services. This all depends on having a SPARQL engine that supports federated queries, of course. I expect that most do these days, but I only have experience with Jena (which does support federated queries).
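
As a rough sketch of the second case, suppose the query is submitted directly to the jobs endpoint (an assumption made purely for illustration, reusing the example endpoints and ex: properties from the query above). The resume pattern is then matched against that store's own data, and only the other two stores appear as services:

```sparql
PREFIX ex: <http://example.org/>
SELECT ?person ?age ?weight ?resume WHERE {
  VALUES ?person { ex:Bill ex:John }
  # Assumption: this query is sent to http://jobs.example.org/sparql,
  # so this pattern is matched against that store's local data.
  ?person ex:resume ?resume .
  # The remaining patterns are delegated to the other two endpoints.
  SERVICE <http://age.example.org/sparql>    { ?person ex:age    ?age    }
  SERVICE <http://weight.example.org/sparql> { ?person ex:weight ?weight }
}
```

Whether this works in practice still depends on the receiving engine supporting federated queries and being permitted to reach the other endpoints.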

Licensed under: CC-BY-SA with attribution