Question

How would I watch an S3 bucket so that files uploaded to, say, a folder called in are processed and then placed in a folder called done?

I am converting PDF files into PNGs for a client and would love to not have a processing node sitting idle 90% of the time. It would be great if there was a way to kick off an EC2 instance once a file was uploaded, do the processing, and then shut the instance down.

Any guidance on this would be greatly appreciated.


Solution

My suggestion would be this:

  • Upload files to S3, and then immediately add a message to an SQS queue containing the name of the file that needs to be processed.
  • Set up an AWS Auto Scaling policy that launches EC2 instances depending on the size of the queue (the number of files waiting to be processed).
  • After a launched EC2 instance processes a file, it deletes the corresponding message from the SQS queue.
  • If there are no more items to be processed, the EC2 instance shuts itself down.
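To make the steps above a bit more concrete, here is a minimal sketch of the uploader side and the worker loop. It assumes `s3` and `sqs` are boto3 clients (for example `boto3.client("s3")` and `boto3.client("sqs")`), that messages are small JSON documents with `bucket` and `key` fields (a format I made up for this example), and that `convert` is a hypothetical function of your own that downloads the PDF, renders the PNGs, and writes them under done/.

```python
import json

IN_PREFIX = "in/"  # folder name taken from the question


def submit_file(s3, sqs, bucket, queue_url, local_path, name):
    """Upload a file under in/ and enqueue a message that names it."""
    key = IN_PREFIX + name
    s3.upload_file(local_path, bucket, key)
    # Message format is an assumption: any body the worker can parse will do.
    sqs.send_message(
        QueueUrl=queue_url,
        MessageBody=json.dumps({"bucket": bucket, "key": key}),
    )
    return key


def drain_queue(sqs, queue_url, convert, wait_seconds=20):
    """Process messages until the queue looks empty; return how many were handled.

    `convert(bucket, key)` is a hypothetical callback supplied by you.
    """
    processed = 0
    while True:
        # Long polling: wait up to `wait_seconds` for a message to arrive.
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=1,
            WaitTimeSeconds=wait_seconds,
        )
        messages = response.get("Messages", [])
        if not messages:
            # Nothing left to do; the caller can now shut the instance down.
            return processed
        for message in messages:
            job = json.loads(message["Body"])
            convert(job["bucket"], job["key"])
            # Delete only after a successful conversion. If convert() raises,
            # the message becomes visible again after the visibility timeout.
            sqs.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            processed += 1
```

On the instance you would call `drain_queue(...)` from a startup script and stop or terminate the instance once it returns. How exactly you do that (an OS shutdown command, an API call, or letting the scaling policy scale in) is up to you and not shown here.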

One thing to keep in mind, however: where instances are billed by the full hour, you pay for the whole hour each time you spin one up, so you don't want to launch an instance, do a few minutes' worth of work, and then shut down or sit idle. It is better to wait until you have close to an hour's worth of work ready to process, i.e. until the queue grows to a certain size. (Spin instances up fast, but shut them down slowly.)
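As a rough illustration of the "wait until there is enough work" idea, the arithmetic might look like the sketch below. The function name, the `seconds_per_file` estimate, and the `max_instances` cap are all assumptions for the example; in practice you would read the queue depth from the queue's `ApproximateNumberOfMessages` attribute and measure the per-file time yourself.

```python
def instances_to_launch(queue_depth, seconds_per_file, running=0,
                        billing_window=3600, max_instances=10):
    """Return how many extra instances the current backlog justifies.

    An instance is only "worth" launching once there is a full billing
    window of work for it, so the backlog is rounded down, not up.
    """
    backlog_seconds = queue_depth * seconds_per_file
    wanted = int(backlog_seconds // billing_window)
    wanted = min(wanted, max_instances)
    # Never return a negative number; scaling in is handled elsewhere.
    return max(0, wanted - running)
```

For example, with files that take about 30 seconds each, 10 queued files (5 minutes of work) would launch nothing, while 120 queued files (a full hour of work) would justify one instance.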

That's the high-level idea; adapt it as you see fit.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow