Question

I'm working on a website that serves user-uploaded videos, and all the infrastructure is on Amazon. I'm trying to find a good working solution. Here is what I'm thinking:

  1. Have an EC2 instance handle file uploads to its EBS volumes, using PHP
  2. Transfer video files to S3
  3. Make Amazon Elastic Transcoder convert the video files to appropriate formats from S3, and store them back on S3
  4. Use CloudFront to serve converted video files to the public

First of all, what do you think about this? Is there an easier way to achieve the same result, or a better one?

Secondly, my main problem right now is transferring files to S3. I tried s3fs, but it has all sorts of odd problems with large file transfers, which made me give up on it.

In response to "Fastest / best way copy data between S3 to EC2?", someone suggested using EBS volumes, but I'm not sure whether it's possible to mount an EBS volume and also have access to the same data on S3.

Any help is appreciated.

Solution

For transferring files to S3, since you are already coding in PHP, you should look at the PHP libraries Amazon provides for interfacing with its services. Mounting S3 as a file system on your instance is not the best approach, because the connection to S3 drops more often than you would like. I know; I have tried.

But generally, your approach to "catch" the file on your instance and then push it to S3 is sound. Be careful, however, not to save a reference to the file in your database until it has reached S3. Otherwise you will have scalability issues if you choose to run multiple instances behind a load balancer, because the other instances cannot see a file that still sits on one instance's disk.
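
The upload-first, record-second ordering might be sketched like this. It is a minimal illustration in Python rather than PHP, assuming an S3 client object that exposes boto3's `upload_file(Filename, Bucket, Key)` method. The `save_video` and `open_db` helpers, the `videos` table, and the use of sqlite3 as a stand-in database are all hypothetical.

```python
import sqlite3


def open_db(path=":memory:"):
    """Create the hypothetical `videos` table (sqlite3 stands in for a real DB)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS videos (bucket TEXT, s3_key TEXT UNIQUE)"
    )
    return db


def save_video(s3_client, db, local_path, bucket, key):
    """Upload a file to S3, then record it in the database.

    The row is written only after the upload call returns without raising,
    so the database never references a file that is not yet on S3.
    """
    # If the upload raises, the INSERT below is never reached.
    s3_client.upload_file(local_path, bucket, key)
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT INTO videos (bucket, s3_key) VALUES (?, ?)",
            (bucket, key),
        )
```

With boto3 installed, a call could look like `save_video(boto3.client("s3"), open_db(), "/tmp/clip.mp4", "my-bucket", "raw/clip.mp4")`, where the path, bucket, and key are placeholders.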

As for the transcoding, Amazon Elastic Transcoder (AET) is pretty new, so I don't have experience with it, but I can highly recommend Zencoder. Incidentally, it runs on AWS, is fast and cheap, and works the way you are expecting AET to work: you give it an S3 file, and you tell it where to write the transcoded variants, thumbnails, etc.
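
For reference, a request of that shape for Elastic Transcoder might look roughly like the following Python sketch. It only builds the keyword arguments for boto3's `create_job` call on the `elastictranscoder` client; the `build_transcode_job` helper, the `transcoded/` prefix, and the pipeline and preset IDs are placeholders, not values from the question.

```python
def build_transcode_job(pipeline_id, input_key, outputs, prefix="transcoded/"):
    """Build keyword arguments for Elastic Transcoder's create_job call.

    `outputs` maps an output file name to a preset ID. The input and output
    buckets are configured on the pipeline, so only object keys appear here.
    """
    return {
        "PipelineId": pipeline_id,
        "Input": {"Key": input_key},
        "OutputKeyPrefix": prefix,
        "Outputs": [
            {"Key": name, "PresetId": preset_id}
            for name, preset_id in outputs.items()
        ],
    }
```

With boto3 installed, the result could be passed as `boto3.client("elastictranscoder").create_job(**build_transcode_job(...))`.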

Hope this helps.

OTHER TIPS

You don't need EBS for this.

You can either write the uploaded data to the EC2 ephemeral drive and then push it to S3, or stream the data directly into S3.

If you go with the ephemeral drive approach, you risk data loss if your instance dies before the push completes, but overall the simplicity of fewer moving parts is probably worth it.

Streaming data directly into S3 is what I would try to do. It removes the entire step of writing to a disk (be it local or EBS), which simplifies your app.
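
A minimal sketch of the streaming approach, in Python rather than PHP, assuming an S3 client that exposes boto3's `upload_fileobj(Fileobj, Bucket, Key)` method. The `stream_to_s3` helper and the bucket and key names are hypothetical.

```python
def stream_to_s3(s3_client, body, bucket, key):
    """Pass an incoming request body to S3 without writing it to disk.

    `body` is any binary file-like object with a read() method, for example
    the request body stream your web framework hands you. boto3's
    upload_fileobj is a managed transfer that uses multipart upload
    when the object is large enough to need it.
    """
    s3_client.upload_fileobj(body, bucket, key)
    return f"s3://{bucket}/{key}"
```

The application still sits in the data path here; the only thing removed is the intermediate file on local or EBS storage.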

Were I to design a system like this, it would be:

  1. Elastic Beanstalk scaling out the front-end
  2. Data streamed directly to S3 by your application
  3. RDS or DynamoDB table tracking what's complete and what's in progress
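
The tracking table in the last item could be as small as one row per upload with a state column. Here is a rough Python sketch using sqlite3 as a stand-in for RDS; the `uploads` table, the state names, and the helper functions are assumptions made for illustration.

```python
import sqlite3

# Hypothetical lifecycle states for an uploaded video.
STATES = ("uploading", "transcoding", "complete", "failed")


def open_tracker(path=":memory:"):
    """Create the tracking table if it does not exist yet."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS uploads ("
        " s3_key TEXT PRIMARY KEY,"
        " state TEXT NOT NULL)"
    )
    return db


def set_state(db, s3_key, state):
    """Insert the upload, or overwrite its state if it is already tracked."""
    if state not in STATES:
        raise ValueError(f"unknown state: {state}")
    with db:
        db.execute(
            "INSERT OR REPLACE INTO uploads (s3_key, state) VALUES (?, ?)",
            (s3_key, state),
        )


def in_progress(db):
    """Return the keys of uploads that have not finished or failed."""
    rows = db.execute(
        "SELECT s3_key FROM uploads"
        " WHERE state IN ('uploading', 'transcoding')"
        " ORDER BY s3_key"
    )
    return [row[0] for row in rows]
```

Because every front-end instance reads and writes the same table, any of them can answer whether a given video is ready to serve.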
Licensed under: CC-BY-SA with attribution