Question

I am working on a collaborative-filtering recommender system. I built such a system before in a parallel, multi-threaded environment, querying RDF with SPARQL. This worked well because SPARQL queries over RDF graphs parallelize naturally. However, I am now working on a standard desktop PC and am wondering whether SPARQL is still the way to go in a largely serial environment. I've looked at dotNetRDF, as I'm using C#, and am wondering whether it is any more efficient than plain SQL, especially now that dotNetRDF seems to be moving away from a SQL back-end.

So as far as performance on a few threads goes: SQL or dotNetRDF? Tables or graphs?


Solution

The two things are not really comparable. dotNetRDF is a programming API that supports a variety of storage backends, in addition to a pure in-memory store that we mainly recommend for testing and development. (Disclaimer: I'm the lead developer.)

The different backends have a wide variety of performance characteristics, so if your problem is expressible in RDF, there is likely an appropriate backend for you.

SQL is a query language, so the real comparison is SQL versus SPARQL, and which you choose ultimately comes down to what your data model looks like. If it's regular, you likely want an RDBMS and SQL; if it's irregular and/or graph-like, you likely want a triple store and SPARQL. The two have different pros and cons, as your own answer implies.

OTHER TIPS

This seems to answer it well enough: Triple Stores vs Relational Databases

Essentially, RDF is much more flexible, but expensive. Since I'm just doing collaborative filtering with data that fits pretty well into a table, I don't think I need the extra expense, as much as I like graphs.
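
To illustrate why tabular data suits this case, here is a minimal sketch of item co-occurrence collaborative filtering written as a single SQL self-join. It uses Python's built-in sqlite3 module only to stay self-contained; the `likes` table, its columns, the sample users, and the `recommend` helper are all made up for illustration, and co-occurrence counting is just one simple scoring scheme, not necessarily the one you would use.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical user/item 'likes' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE likes ("
        " user_id TEXT NOT NULL,"
        " item_id TEXT NOT NULL,"
        " PRIMARY KEY (user_id, item_id))"
    )
    rows = [
        ("alice", "a"), ("alice", "b"),
        ("bob", "a"), ("bob", "b"), ("bob", "c"),
        ("carol", "a"), ("carol", "d"),
        ("dave", "b"), ("dave", "c"),
    ]
    conn.executemany("INSERT INTO likes (user_id, item_id) VALUES (?, ?)", rows)
    return conn


def recommend(conn, user_id, limit=3):
    """Score unseen items by how often they co-occur with the user's items.

    mine  = items the target user likes
    peer  = other users who like one of those items
    other = everything those peers like, minus what the user already has
    """
    query = """
        SELECT other.item_id, COUNT(*) AS score
        FROM likes AS mine
        JOIN likes AS peer
          ON peer.item_id = mine.item_id AND peer.user_id <> mine.user_id
        JOIN likes AS other
          ON other.user_id = peer.user_id
        WHERE mine.user_id = ?
          AND other.item_id NOT IN (
              SELECT item_id FROM likes WHERE user_id = ?
          )
        GROUP BY other.item_id
        ORDER BY score DESC, other.item_id ASC
        LIMIT ?
    """
    return conn.execute(query, (user_id, user_id, limit)).fetchall()


conn = build_demo_db()
result = recommend(conn, "alice")
print(result)
```

With a regular schema like this, the whole recommendation step is one indexed join, which is the kind of workload a relational engine handles well even on a single thread. If the data later grew irregular attributes or arbitrary relationships, the graph model would start to earn its cost.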

Licensed under: CC-BY-SA with attribution