Question

I have installed a Tomcat 7 server on Amazon EC2. One of my servlets receives a file as a multipart POST. I need to store these files in a directory structure; later, another servlet will retrieve each file, send it to a client, and delete it from Amazon Web Services.

My question: where and how should I store these files, and how should I create a directory structure using servlets?

I am looking for:

  1. The quickest possible access to the file.
  2. Storage of the file only until it has been sent to the client.

Solution

You really have two options:

1-Store the content on storage attached to your instance. For this, I highly recommend an EBS volume over the instance store (see this question for the background). Storing and retrieving your files would be faster, and you can always resize your instance when needed.

2-Store your files in S3. Store/retrieve times will be slower, but you get "automagic" scalability, encryption, enhanced durability and availability (without putting effort into them), and the possibility of making the files publicly available through direct links, without passing through your web application. In addition, since the files are not tied to a specific EC2 instance, you can scale your web application by adding new instances while keeping the files centralized in S3.
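For the first option, creating the directory structure from a servlet comes down to ordinary file I/O. The following is only a minimal sketch, assuming Java 7 NIO and a layout of one sub-directory per client; the class name `LocalFileStore`, the `clientId`/`fileId` parameters, and the layout itself are hypothetical and not part of the question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: stores uploads under <root>/<clientId>/<fileId>.
final class LocalFileStore {
    private final Path root;

    LocalFileStore(Path root) {
        // Normalize once so the containment check below compares like with like.
        this.root = root.toAbsolutePath().normalize();
    }

    // Builds the target path and rejects names (such as "..") that would
    // place the file outside the storage root.
    Path resolve(String clientId, String fileId) {
        Path target = root.resolve(clientId).resolve(fileId).normalize();
        if (!target.startsWith(root) || target.equals(root)) {
            throw new IllegalArgumentException("Path escapes storage root: " + target);
        }
        return target;
    }

    // Creates any missing directories, then writes the stream to disk,
    // overwriting an existing file with the same name.
    Path store(String clientId, String fileId, InputStream content) throws IOException {
        Path target = resolve(clientId, fileId);
        Files.createDirectories(target.getParent());
        Files.copy(content, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }
}
```

In a Servlet 3.0 container such as Tomcat 7, the input stream would typically come from `request.getPart("file").getInputStream()`, and the root would point at a directory on the mounted EBS volume; both the part name and the mount location are assumptions here.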

My recommendation would be S3, even though you might lose some speed when delivering files. Set up both environments and run some tests; that may help you decide.

Hope it helps.

OTHER TIPS

+1 for @Viccari's answer. It covers the options well.

However, I disagree with the conclusion to use S3, because of your two requirements: the quickest possible access to the file, and storage only until the file has been sent to the client.

S3 is notably slower than EBS-based storage. Since you are storing a given file for a single client only until it is retrieved once, you don't need the scalability that S3 would provide. In fact, it can take quite a while before data stored in S3 replicates to other availability zones.
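The "retrieve once, then delete" pattern on local storage can be sketched as follows. This is an illustration under assumptions rather than a definitive implementation: the class `OneTimeDelivery` and the method `sendAndDelete` are hypothetical names, and the file is deleted only after the copy to the output stream has completed without an exception.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for files that are served exactly once.
final class OneTimeDelivery {
    // Streams the file to the given output, then removes it from disk.
    // If the copy fails, the exception propagates and the file is kept,
    // so the client can retry the download.
    static long sendAndDelete(Path file, OutputStream out) throws IOException {
        long bytesSent = Files.copy(file, out);
        out.flush();
        Files.delete(file);
        return bytesSent;
    }
}
```

In the retrieving servlet, `out` would be `response.getOutputStream()`. Note that a successful write to the response stream does not guarantee the client received every byte, so a stricter design might delete the file only after an explicit acknowledgement.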

If the data you were storing were going to be served many times, S3 would be a more reasonable choice (as long as its performance is adequate for your needs). For that use case (not for yours), I would also layer CloudFront on top of S3.

Licensed under: CC-BY-SA with attribution