Question

I have organized, non-relational data that lives in both a file system and a SQL database. An application queries both sources.

What would be some cloud solutions for storing this data, which totals about 1 TB? I'd like to be able to migrate this data into the cloud solution and alter the application to query the data in the cloud.

So far, I've looked at AWS options: SimpleDB, DynamoDB, and MongoDB on an EC2 instance with EBS for increased storage.

I've also looked into Azure's Table Storage.

SimpleDB has a 10 GB limit. DynamoDB is on SSD and might be overkill for my needs. Did I miss something? Are MongoDB on AWS or Azure Table Storage suitable options?

Solution

I think the solution depends heavily on your data access patterns.

I've used Azure Table Storage and it's great for many things. I've used DynamoDB and it's also good for quite a few things. Both are good table stores, but both have restrictions around read indexes, querying, and transactions. That's sometimes a show stopper. Both will require retooling your data and all the dependent applications.
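
To make the querying restriction concrete, here is a minimal in-memory sketch of how a key-based table store behaves when no secondary index is defined. The `table` contents and the function names are hypothetical and do not correspond to either service's API; the point is only that lookups by key are direct, while filtering on any other attribute means reading every item.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a table store: items are addressed by
# (partition key, row key) and nothing else is indexed.
Key = Tuple[str, str]
table: Dict[Key, dict] = {
    ("customer-1", "order-001"): {"status": "shipped", "total": 40},
    ("customer-1", "order-002"): {"status": "open", "total": 15},
    ("customer-2", "order-001"): {"status": "open", "total": 99},
}


def get_item(partition: str, row: str) -> dict:
    """Point lookup by full key: the access pattern these stores favor."""
    return table[(partition, row)]


def query_partition(partition: str) -> List[dict]:
    """Return all items in one partition, ordered by row key.

    A real store would serve this from its key index; this sketch
    simply filters the dict to keep the example short.
    """
    return [item for (p, _), item in sorted(table.items()) if p == partition]


def scan_by_status(status: str) -> List[dict]:
    """Filter on a non-key attribute: every item has to be examined."""
    return [item for item in table.values() if item["status"] == status]
```

If the application mostly runs queries shaped like `scan_by_status`, the data would need to be re-keyed or duplicated under different keys, which is the retooling mentioned above.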

For your file storage:

  1. (Cheapest, slowest) Migrate your files to a blob store (Azure Blob Storage or AWS S3) and leave them there, accessing the blob store as if it were a drive. This is slow, but cheap.
  2. (Performant) Use an EC2 instance with EBS drives and store your files there. Access the data on the local file system. This is durable and performant.
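
For the blob store option, one part of the migration is deciding how local paths map to object keys. The following is a rough sketch of that mapping only, using the standard library; `plan_uploads` and the `prefix` parameter are hypothetical names, and the actual upload call is left out.

```python
from pathlib import Path
from typing import Iterator, Tuple


def plan_uploads(root: str, prefix: str = "") -> Iterator[Tuple[Path, str]]:
    """Yield (local_path, object_key) for every file under root.

    The key is the file's path relative to root, with forward slashes,
    so the directory layout is preserved inside the bucket or container.
    """
    root_path = Path(root)
    for path in sorted(root_path.rglob("*")):
        if path.is_file():
            key = prefix + path.relative_to(root_path).as_posix()
            yield path, key
```

Each pair could then be handed to whichever client you use. With boto3, for example, the S3 client's `upload_file(Filename, Bucket, Key)` takes exactly these values plus a bucket name.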

For your relational data, leave it relational and store it in a cloud relational database service (RDS with MySQL, RDS with SQL Server, SQL Azure, etc.).

With this approach, there's no need to change your applications or their data access patterns just because you are moving to the cloud.
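
As an illustration of how small the application change can be when the database stays relational, here is a sketch that repoints a connection URL at a new host and leaves everything else alone. It assumes the application keeps its connection settings in URL form; `retarget` and both host names are made up for the example, and IPv6 host literals are not handled.

```python
from urllib.parse import urlsplit, urlunsplit


def retarget(dsn: str, new_host: str) -> str:
    """Return dsn with only the host replaced.

    Scheme, credentials, port, and database name are kept as they were.
    """
    parts = urlsplit(dsn)
    userinfo, _, hostport = parts.netloc.rpartition("@")
    _, sep, port = hostport.partition(":")
    netloc = (userinfo + "@" if userinfo else "") + new_host + sep + port
    return urlunsplit(parts._replace(netloc=netloc))


# Hypothetical on-premises URL and hypothetical cloud endpoint.
old_dsn = "mysql://app:secret@db.internal:3306/inventory"
new_dsn = retarget(old_dsn, "mydb.example.com")
```

The queries and schema stay the same; only the endpoint the application connects to is different.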

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow