Question

I have a very simple database structure (basically only a user table) with two basic operations:

Users can log in, get an authentication token, and later submit some data with that token that is stored in this User table (only a couple of bytes).

However, many users can do this simultaneously (100k users in one minute, for instance, for a restricted period of time).

I am wondering what would be a good choice of technology. I'm not afraid to use NoSQL databases or anything, and am trying to end up with something scalable.

I've been thinking about a queueing system with a task that populates the database. Or would I just need Amazon SimpleDB and not even bother with queueing the messages? Or do I need the RDS solution to get multiple EC2 instances talking to the "same" database? Or not Amazon Web Services at all? Thanks for any pointers. I'm pretty new at this and want to get some insight into the various trade-offs and what's best for my application.
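For illustration, the "queue plus a task that populates the database" idea could look roughly like the sketch below. It is only a local stand-in under stated assumptions: Python's in-process `queue.Queue` plays the role of a real message queue, an in-memory SQLite database plays the role of the User table, and the names `BATCH_SIZE`, `writer`, `flush` and `run_demo` are hypothetical.

```python
import queue
import sqlite3
import threading

BATCH_SIZE = 500   # hypothetical batch size, tune for the real database
_STOP = object()   # sentinel telling the writer to finish


def flush(conn, batch):
    """Write one batch of (data, token) pairs in a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany("UPDATE users SET data = ? WHERE token = ?", batch)


def writer(conn, submissions):
    """Drain the queue and store submissions in batches."""
    batch = []
    while True:
        item = submissions.get()
        if item is _STOP:
            break
        batch.append(item)
        if len(batch) >= BATCH_SIZE:
            flush(conn, batch)
            batch = []
    if batch:  # write whatever is left over
        flush(conn, batch)


def run_demo(n_users):
    """Simulate n_users submitting data; return how many rows were stored."""
    # Assumption: SQLite in memory stands in for the real User table.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE users (token TEXT PRIMARY KEY, data BLOB)")
    with conn:
        conn.executemany(
            "INSERT INTO users (token) VALUES (?)",
            [(f"token-{i}",) for i in range(n_users)],
        )

    submissions = queue.Queue()
    worker = threading.Thread(target=writer, args=(conn, submissions))
    worker.start()

    # The web tier only enqueues; it never waits on the database.
    for i in range(n_users):
        submissions.put((b"\x01\x02", f"token-{i}"))
    submissions.put(_STOP)
    worker.join()

    (stored,) = conn.execute(
        "SELECT COUNT(*) FROM users WHERE data IS NOT NULL"
    ).fetchone()
    conn.close()
    return stored


if __name__ == "__main__":
    print(run_demo(2000))
```

The point of the design is that the request handler returns as soon as the submission is queued, while a single writer turns many small writes into a few batched transactions. The trade-off is that a submission is not durable until the writer has flushed it.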


Solution

As PachinSV pointed out, you can choose whichever DB solution you prefer. I will add one more: you can launch an EC2 instance and install the DBMS of your choice, either NoSQL (MongoDB, Cassandra), SQL (MSSQL, MySQL, Oracle), or whatever. With this answer I will try to cover the other aspects of your question: the application itself, the scalability and, if needed, the storage.

My suggestion, though, would be something like:

-One or two EC2 instances, probably small or medium (take a look at the EC2 instance types), to take care of your application load.
-Whenever you need to scale, you can add an Elastic Load Balancer in front of the EC2 instances, so you can keep adding instances to your ecosystem and scale horizontally.
-For the DB, I would start with an RDS instance (probably a small one), with the engine of your preference: MySQL, SQL Server or Oracle. With RDS, you can change your instance size as you go, and you can also add one or more read replicas in case your app becomes read-intensive in the future. Another good option, as PachinSV pointed out, would be DynamoDB, for the reasons he mentioned already: partitioning, performance, fewer restrictions, etc.
-Although you have not mentioned it, if you need scalable storage, S3 is definitely the way to go, and would be ready for your use.

Hope it helps.

OTHER TIPS

If you have no problem using a NoSQL database in AWS, you have two options: SimpleDB and DynamoDB.

With SimpleDB you have a limit of 10 GB per table, which means you will have to worry about partitioning your data across multiple tables, and you also have a restriction of 25 writes per second.
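To put that write limit next to the load described in the question, here is a rough back-of-the-envelope calculation. It assumes one write per user, spread evenly over the minute, and it takes the 25 writes per second figure from this answer as given rather than as a verified service limit. The function names are made up for the example.

```python
import math

USERS_PER_BURST = 100_000      # from the question: 100k users...
BURST_SECONDS = 60             # ...in one minute
WRITES_PER_SECOND_LIMIT = 25   # figure quoted in this answer, not verified


def required_write_rate(users, seconds):
    """Average writes per second, assuming one write per user."""
    return users / seconds


def partitions_needed(rate, per_partition_limit):
    """How many separately limited tables would be needed to absorb the rate."""
    return math.ceil(rate / per_partition_limit)


rate = required_write_rate(USERS_PER_BURST, BURST_SECONDS)
print(round(rate))                                        # 1667
print(partitions_needed(rate, WRITES_PER_SECOND_LIMIT))   # 67
```

Under those assumptions, the burst is roughly 1,667 writes per second, so the data would have to be spread over about 67 tables to stay within the quoted limit, and real traffic is rarely that evenly distributed.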

With DynamoDB all that is handled automatically, and you have no restrictions on table size or number of writes. And the best part is that all your data is stored on solid-state drives (SSDs), which gives you better performance.

And if you are a more traditional person like me, you can use RDS (Relational Database Service), where you can choose between MSSQL, MySQL and Oracle.

As you mentioned, in RDS you will be limited to the capacity of the largest instance they have available, and you cannot scale easily beyond that. If you end up opting for a SQL solution for ACID compliance, and you want a database service (as opposed to installing the database yourself on an EC2 instance), another MySQL option on EC2 is Xeround, which has automatic scaling so it can accommodate large numbers of concurrent users and high throughput.

Licensed under: CC-BY-SA with attribution