Code logic for image processing with Amazon SQS

https://stackoverflow.com//questions/22015947

21-12-2019

Question

I am implementing Amazon SQS for image processing. I want to know how to write logic in C# that processes at most X images at a time and reads more messages to fill that capacity of X (let's say 20 images).

From my understanding, I follow this logic for image processing:

1) The image is uploaded to the server
2) I upload the image to S3 for temporary storage under a unique string
3) I create an SQS message with the image location, to be read later on (plus any other data I need to store)
4) I read the messages on the server
5) If there is a message (up to X, let's say 20), I run the code that processes the image
6) The code: download the temp image from S3, process the image, and upload the processed image back to S3

What code logic should a webservice use, after the user has uploaded an image, to continuously check for messages while processing only X images at any given time? It should make sure that when, for example, one image finishes processing, we can start processing another one.

e.g. I am processing 20 images; one finishes processing, so now I can process one more image.

I thought about running the receiveMessageResponse call each time a single image finishes processing, and keeping a counter of the active processes in memory (a static variable). So I start with counter = 0; when, let's say, three images have been uploaded, the counter equals 3. When it reaches 20, I won't process any more images; when one finishes, counter = counter - 1. An if statement checks the counter, and if it is less than X (let's say a quota of 20), I process X - counter more images.

In my application: the user uploads the image to the EC2 server using a webservice. I have a WCF service that does the processing which is called from the webservice itself. I have a callback function when the image processing is complete.

What is the best logic for it?

Solution

Three decent options come to mind. I will list them in my order of preference (best to worst).

1) Use a semaphore with an initial count of 20. The thread that reads from SQS acquires the semaphore before it calls SQS. The thread that does the image processing releases the semaphore when the processing is complete (make sure to release in a finally clause in case there is an exception). This is ideal because it allows any number of worker threads and any number of SQS polling threads. It could also be implemented to fetch multiple messages in a single receive message call.

2) Have 20 SQS polling threads, each of which synchronously does the image processing on the same thread. This doesn't have great separation of concerns, but it uses the smallest number of threads, and it is by far the easiest to implement. The biggest downside is that you cannot receive multiple messages in a single receive message call. Assuming the image processing is orders of magnitude more resource intensive than an SQS poll, this isn't an issue. But if SQS polling represents a significant portion of your CPU utilization, this can be a problem.

3) Have 1 thread read from SQS and hand off to 20 worker threads. Make the handoff blocking so that the reader thread never pulls down more than 1 message beyond what can be processed. This makes it hard to scale up the number of reader threads, and it isn't ideal because you have 1 received message waiting for a worker thread to pick it up.
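As a rough illustration of option 1, here is a minimal sketch that uses `SemaphoreSlim` to cap the number of images in flight. It is not a definitive implementation: the `ConcurrentQueue<string>` is a hypothetical in-memory stand-in for the SQS receive call, and `SemaphoreLimitedPoller`, `RunAsync`, and `processImage` are names invented for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class SemaphoreLimitedPoller
{
    // Drains `queue` (assumption: an in-memory stand-in for SQS) while never
    // running more than `maxConcurrent` jobs at once.
    // Returns the highest number of jobs that were observed running together.
    public static async Task<int> RunAsync(
        ConcurrentQueue<string> queue,
        int maxConcurrent,
        Func<string, Task> processImage)
    {
        using var slots = new SemaphoreSlim(maxConcurrent, maxConcurrent);
        var running = new List<Task>();
        var gate = new object();
        int active = 0, peak = 0;

        while (true)
        {
            // Acquire a slot BEFORE receiving, so we never hold a message
            // that we have no capacity to process.
            await slots.WaitAsync();

            if (!queue.TryDequeue(out var message))
            {
                slots.Release(); // nothing received: hand the slot back
                break;           // a real poller would poll again instead of exiting
            }

            running.Add(Task.Run(async () =>
            {
                lock (gate) { active++; if (active > peak) peak = active; }
                try
                {
                    await processImage(message);
                }
                finally
                {
                    lock (gate) { active--; }
                    slots.Release(); // always free the slot, even if processing throws
                }
            }));
        }

        await Task.WhenAll(running);
        return peak;
    }
}
```

In a real service the `TryDequeue` call would be replaced by the SQS receive call, and each message would be deleted from the queue only after its image has been processed successfully, so that a failed job becomes visible again once its visibility timeout expires.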

Note: The more I think about it, the more #2 seems to be the best option for you. The simplicity of the design should outweigh the benefit of allowing batch receiving in #1. If you need it later, you can always change the design. Even though the same thread is doing the processing, you should be sure to design your code so that image processing and SQS polling are decoupled.
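A minimal sketch of option 2 with that decoupling might look like the following. Everything named here is an assumption made for illustration: `TryReceive` is a hypothetical delegate standing in for "receive one message from SQS", `processImage` is whatever does the actual work, and `PollAndProcessWorkers` is an invented class name. The concurrency limit is simply the number of threads started.

```csharp
using System;
using System.Threading;

// Hypothetical abstraction over "receive one message";
// returns false when no message arrived.
public delegate bool TryReceive(out string message);

public static class PollAndProcessWorkers
{
    // Starts `workerCount` threads. Each thread receives one message and
    // processes it synchronously before receiving the next one.
    public static Thread[] Start(
        int workerCount,
        TryReceive tryReceive,
        Action<string> processImage,
        CancellationToken stop)
    {
        var threads = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            threads[i] = new Thread(() => WorkerLoop(tryReceive, processImage, stop))
            {
                IsBackground = true
            };
            threads[i].Start();
        }
        return threads;
    }

    private static void WorkerLoop(
        TryReceive tryReceive,
        Action<string> processImage,
        CancellationToken stop)
    {
        while (!stop.IsCancellationRequested)
        {
            if (!tryReceive(out string message))
            {
                Thread.Sleep(20); // idle back-off; SQS long polling could replace this
                continue;
            }

            try
            {
                processImage(message);
            }
            catch (Exception ex)
            {
                // One failed image must not kill the worker thread.
                Console.Error.WriteLine($"Processing failed: {ex.Message}");
            }
        }
    }
}
```

Because the loop only knows about the two delegates, the polling code and the image-processing code can be tested and replaced independently, which should make a later move to option 1 less painful.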

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow