Question

I am building the back-end for my web app; it would act as an API for the front-end and it will be written in Python (Flask, to be precise).

After taking some decisions regarding design and implementation, I got to the database part. And I started thinking whether NoSQL data storage may be more appropriate for my project than traditional SQL databases. Following is a basic functionality description which should be handled by the database and then a list of pros and cons I could come up with regarding to which type of storage should I opt for. Finally some words about why I have considered RethinkDB over other NoSQL data storages.

Basic functionality of the API

The API consists of only a few models: Artist, Song, Suggestion, User and UserArtists.

I would like to be able to add a User with some associated data and link some Artists to it. I would like to add Songs to Artists on request, and also generate a Suggestion for a User, which will contain an Artist and a Song.

Maybe one of the most important parts is that Artists will be periodically linked to Users (and also Artists can be removed from the system -- hence from Users too -- if they don't satisfy some criteria). Songs will also be dynamically added to Artists. All this means is that Users don't have a fixed set of Artists and nor do Artists have a fixed set of Songs -- they will be continuously updating.

Pros

for NoSQL:

  • Flexible schema, since not every Artist will have a FacebookID or Song a SoundcloudID;
  • While a JSON API, I believe I would benefit from the fact that records are stored as JSON;
  • I believe the number of Songs, but especially Suggestions will raise quite a bit, hence NoSQL will do a better job here;

for SQL:

  • It's fixed schema may come in handy with relations between models;
  • Flask has support for SQLAlchemy which is very helpful in defining models;

Cons

for NoSQL:

  • Relations are harder to implement and updating models transaction-like involves a bit of code;
  • Flask doesn't have any wrapper or module to ease things, hence I will need to implement some kind of wrapper to help me make the code more readable while doing database operations;
  • I don't have any certainty on how should I store my records, especially UserArtists

for SQL:

  • Operations are bulky, I have to define schemas, check whether columns have defaults, assign defaults, validate data, begin/commit transactions -- I believe it's too much of a hassle for something simple like an API;

Why RethinkDB?

I've considered RehinkDB for a possible implementation of NoSQL for my API because of the following:

  • It looks simpler and more lightweight than other solutions;
  • It has native Python support which is a big plus;
  • It implements table joins and other things which could come in handy in my API, which has some relations between models;
  • It is rather new, and I see a lot of implication and love from the community. There's also the will to continuously add new things that leverage database interaction.

All these being considered, I would be glad to hear any advice on whether NoSQL or SQL is more appropiate for my needs, as well as any other pro/con on the two, and of course, some corrections on things I haven't stated properly.

Was it helpful?

Solution

I'm working at RethinkDB, but that's my unbiased answer as a web developer (at least as unbiased as I can).

  • Flexible schema are nice from a developer point of view (and in your case). Like you said, with something like PostgreSQL you would have to format all the data you pull from third parties (SoundCloud, Facebook etc.). And while it's not something really hard to do, it's not something enjoyable.

  • Being able to join tables, is for me the natural way of doing things (like for user/userArtist/artist). While you could have a structure where a user would contain artists, it is going to be unpleasant to use when you will need to retrieve artists and for each of them a list of users.

The first point is something common in NoSQL databases, while JOIN operations are more a SQL databases thing. You can see RethinkDB as something providing the best of each world.

I believe that developing with RethinkDB is easy, fast and enjoyable, and that's what I am looking for as a web developer.

There is however one thing that you may need and that RethinkDB does not deliver, which is transactions. If you need atomic updates on multiple tables (or documents - like if you have to transfer money between users), you are definitively better with something like PostgreSQL. If you just need updates on multiple tables, RethinkDB can handle that.

And like you said, while RethinkDB is new, the community is amazing, and we - at RethinkDB - care a lot about our users.

If you have more questions, I would be happy to answer them : )

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top