Question

I've built an application that, if all goes well, could generate a large amount of data. At present I'm using a MySQL database to store the information, and my queries use INNER and LEFT joins to filter data. I was going to experiment with DynamoDB anyway, but I thought I would ask whether people think it suits the following data structure, or whether I should stay with a relational database.

For instance, let's say I had a project table with project_id as the primary key. Each "project" could have a number of users associated with it. When manager A logs in, he might want to see all the projects that his team members have. In the relational model this might be structured as follows:

  **project**                      **project_to_user**
  project_id PK                    project_id
  project_title                    user_id

  select p.project_id, p.project_title from project as p inner join project_to_user as pto on p.project_id = pto.project_id where pto.user_id in (1, 2, 3, 4);

I could in theory keep a similar structure in DynamoDB. However, I would first have to select all project_ids from project_to_user for each user_id (a lot of reads), or possibly do a scan if user_id were stored as a set of user_ids. Then I could select all projects based on the returned ids (possibly removing duplicates in code). Alternatively, I thought I could scrap the project_to_user table, keep a user_ids attribute on project, and scan that table. I'm aware that scans aren't the best way to go with DynamoDB, but could this be offset by the fact that the first method could involve a lot of reads anyway?

My app doesn't have a lot of tables, which I understand makes it a good candidate for Amazon DynamoDB, but should I stick to the relational model?

I know this can seem quite open-ended. I'm excited by the prospect of the scale DynamoDB offers, but I'm wondering whether it is the best fit for this sort of thing. I can, however, see DB management becoming a major headache down the line if I stick with the relational model. I've already redesigned the DB to fit the DynamoDB model, but it's just these "JOIN" points that make me hesitant to make the jump, and I would appreciate any insights people might have.

I'm also experimenting with MongoDB to get used to NoSQL, but as I understand it I would have to manage that set-up more than I would with Amazon DynamoDB (which is a pro for Amazon).

Many thanks

* EDIT * There could be as many searches for the user_id query as there could be for the project_id, if not more, but each project also needs to be separately identified


Solution

The rule of thumb is this: if your queries are achievable using DynamoDB, then it is a good fit. As for joins, DynamoDB does not perform them for you, so you need to do them in your code at the application level.
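To make "join at the application level" concrete, here is a minimal sketch of the two-step lookup described in the question. The dictionaries are hypothetical in-memory stand-ins for the `project_to_user` and `project` tables, and the sample data is invented; against DynamoDB, step 1 would be one Query per user and step 2 would be BatchGetItem calls, which accept at most 100 keys per request.

```python
from itertools import islice

# Hypothetical in-memory stand-ins for the two tables in the question.
# Against DynamoDB, each lookup below would be a Query or BatchGetItem call.
PROJECT_TO_USER = {          # user_id -> project_ids (one Query per user)
    1: [10, 11],
    2: [11, 12],
    3: [],
    4: [12],
}
PROJECTS = {                 # project_id -> item (fetched by primary key)
    10: {"project_id": 10, "project_title": "Alpha"},
    11: {"project_id": 11, "project_title": "Beta"},
    12: {"project_id": 12, "project_title": "Gamma"},
}

BATCH_SIZE = 100  # BatchGetItem accepts at most 100 keys per request


def chunks(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def projects_for_users(user_ids):
    """Application-level equivalent of the INNER JOIN ... WHERE user_id IN (...)."""
    # Step 1: collect project ids per user; the set removes duplicates.
    project_ids = set()
    for user_id in user_ids:
        project_ids.update(PROJECT_TO_USER.get(user_id, []))

    # Step 2: fetch the projects by key, one batch at a time.
    projects = []
    for batch in chunks(sorted(project_ids), BATCH_SIZE):
        projects.extend(PROJECTS[pid] for pid in batch if pid in PROJECTS)
    return projects
```

Calling `projects_for_users([1, 2, 3, 4])` returns each project once even though projects 11 and 12 are shared by two users. The number of reads grows with the number of users plus the number of distinct projects, which is the cost to weigh against a scan.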

If you are able to design your tables in DynamoDB to satisfy your queries, then you get the benefits of a fully managed (zero administration) database with practically unlimited scale.

DynamoDB also recently added support for Global Secondary Indexes (GSIs), which make queries a lot more flexible.
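As a rough illustration of how a GSI helps with the user_id lookup, the sketch below assumes a `project_to_user` table keyed on project_id (hash) and user_id (range), plus a GSI named `user_id-index` keyed on user_id. Those names and the numeric key types are assumptions, not something from the question. The request shape follows the low-level DynamoDB Query API (`IndexName`, `KeyConditionExpression`, `ExclusiveStartKey`); `FakeClient` is only an in-memory stand-in so the example runs without AWS, and in real code you would pass something like `boto3.client("dynamodb")` instead.

```python
def project_ids_for_user(client, user_id):
    """Return all project ids for one user by querying a GSI keyed on user_id."""
    kwargs = {
        "TableName": "project_to_user",     # hypothetical table name
        "IndexName": "user_id-index",       # hypothetical GSI name
        "KeyConditionExpression": "user_id = :u",
        "ExpressionAttributeValues": {":u": {"N": str(user_id)}},
    }
    project_ids = []
    while True:
        page = client.query(**kwargs)
        project_ids += [int(item["project_id"]["N"]) for item in page["Items"]]
        # Query results are paginated; keep going until no key is returned.
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return project_ids
        kwargs["ExclusiveStartKey"] = last_key


class FakeClient:
    """In-memory stand-in for the real client; returns one item per page."""

    def __init__(self, rows):
        self.rows = rows  # list of (project_id, user_id) pairs

    def query(self, **kwargs):
        user_id = kwargs["ExpressionAttributeValues"][":u"]["N"]
        matches = sorted(p for p, u in self.rows if str(u) == user_id)
        start_key = kwargs.get("ExclusiveStartKey")
        if start_key:
            after = int(start_key["project_id"]["N"])
            matches = [p for p in matches if p > after]
        items = [
            {"project_id": {"N": str(p)}, "user_id": {"N": user_id}}
            for p in matches[:1]
        ]
        page = {"Items": items}
        if len(matches) > 1:
            page["LastEvaluatedKey"] = items[0]
        return page


client = FakeClient([(10, 1), (11, 1), (11, 2)])
print(project_ids_for_user(client, 1))
```

With the index in place, "all projects for user X" becomes a Query rather than a Scan, while lookups by project_id still go through the table's primary key.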

Licensed under: CC-BY-SA with attribution