How do I make HTTP requests in Rails while still servicing many requests per minute?

https://stackoverflow.com//questions/22029333

21-12-2019

Question

I'm trying to scale up an app server to process over 20,000 requests per minute.

When I stress-test the app, most endpoints easily handle 20,000 RPM or more.

But endpoints that need to make an external HTTP request (e.g., Facebook Login) slow the server to a crawl (3,000 RPM).

I conceptually understand the limitations of my current environment -- 3 load-balanced servers with 4 unicorn workers per server can only handle 12 requests at a time, even if all of them are waiting on HTTP requests.
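The arithmetic behind that limit can be sketched with Little's law (concurrency = throughput x latency). This is only a back-of-the-envelope estimate: it assumes all 12 workers are busy for the whole test and that the 3,000 RPM ceiling comes entirely from waiting on the external call. The helper names are made up for illustration.

```ruby
# Rough capacity estimate using Little's law:
#   requests in flight = throughput (req/s) * latency (s)
# Rationals are used to avoid floating-point rounding surprises.

# Average time each request occupies a worker, given that `slots`
# workers are saturated while serving `requests_per_minute`.
def implied_latency_seconds(slots, requests_per_minute)
  slots / Rational(requests_per_minute, 60)
end

# Number of requests that must be in flight at once to reach
# `target_rpm` when each request takes `latency_seconds`.
def slots_needed(target_rpm, latency_seconds)
  (Rational(target_rpm, 60) * latency_seconds).ceil
end

slots   = 3 * 4                                  # 3 servers x 4 unicorn workers
latency = implied_latency_seconds(slots, 3_000)  # seconds per blocked request
needed  = slots_needed(20_000, latency)

puts "concurrent slots today:  #{slots}"
puts "implied latency:         #{latency.to_f} s"
puts "slots needed for 20k RPM: #{needed}"
```

Under those assumptions each blocked request holds a worker for roughly a quarter of a second, so reaching 20,000 RPM on these endpoints needs several times the current number of concurrent slots, however they are provided (processes, threads, or an event loop).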

What are my options for scaling this better? I'd like to handle many more connections at once.

Possible solutions as I understand them:

1. Brute force: use more unicorn workers (i.e., more RAM) and more servers.
2. Push all the blocking operations into background/worker processes to free up the web processes. Clients will need to poll periodically to find out when their request has completed.
3. Move to Puma instead of Unicorn (and probably to Rubinius from MRI), so that I can use threads instead of processes -- which may(??) improve memory usage per connection, and therefore allow the number of workers to be increased.
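To illustrate why the threaded option can help with I/O-bound work, here is a minimal sketch using only the Ruby standard library. It is not Puma itself: `slow_external_call` is a hypothetical stand-in that uses `sleep` to simulate a blocking HTTP call. MRI releases its global lock while a thread is sleeping or blocked on I/O, so the waits overlap even without switching runtimes.

```ruby
require "benchmark"

# Hypothetical stand-in for a blocking external HTTP call
# (for example, a Facebook Login round trip).
def slow_external_call(seconds)
  sleep(seconds)
  :ok
end

# Run `count` simulated calls, one per thread, and collect the results.
def run_in_threads(count, seconds)
  threads = Array.new(count) { Thread.new { slow_external_call(seconds) } }
  threads.map(&:value) # value joins each thread and returns its result
end

# Run the same calls one after another, as a single-threaded worker would.
def run_serially(count, seconds)
  Array.new(count) { slow_external_call(seconds) }
end

serial   = Benchmark.realtime { run_serially(5, 0.1) }
threaded = Benchmark.realtime { run_in_threads(5, 0.1) }

puts format("serial:   %.2fs", serial)
puts format("threaded: %.2fs", threaded)
```

The threaded run should finish in about the time of one call rather than the sum of all five. This only helps while the threads are waiting; CPU-bound Ruby code in MRI still runs one thread at a time.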

Fundamentally, what I'm looking for is: Is there a better way to increase the number of blocked/queued requests a single worker can handle so that I can increase the number of connections per server?

For example, I've heard discussion of using Thin with EventMachine. Does this open up the possibility of a Rails worker that can put down the web request it's currently working on (because that one is waiting on an external server) and pick up another request while it's waiting? If so, is this a worthwhile avenue to pursue for performance compared with Unicorn and Puma? (Does it strongly depend on the runtime activities of the app?)

Solution

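Unicorn is a single-threaded, multi-process, synchronous app server. Each worker handles one request at a time, so a worker that is waiting on an external HTTP call can do nothing else. It's not a good match for this kind of processing.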

It sounds like your application is I/O bound. This argues for an event-oriented daemon to process your requests.

I'd recommend trying EventMachine with the em-http-request and em-http-server gems.

This will allow you to service both incoming requests to the http server and outgoing HTTP service calls asynchronously.
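As a rough sketch of the outgoing half of that setup, the following issues several HTTP requests from one process with em-http-request and collects the results as each one completes. It assumes the eventmachine and em-http-request gems are installed; `fetch_concurrently` is a hypothetical helper name, and the incoming side (em-http-server) is not shown.

```ruby
# Fire off all requests at once inside a single EventMachine reactor
# and return a hash of url => HTTP status (or error message).
def fetch_concurrently(urls)
  # Loaded here so the gems are only required when the helper is used.
  require "eventmachine"
  require "em-http-request"

  results = {}
  return results if urls.empty?

  EventMachine.run do
    pending = urls.size
    # Stop the reactor once every request has succeeded or failed.
    finish = lambda do
      pending -= 1
      EventMachine.stop if pending.zero?
    end

    urls.each do |url|
      # .get returns immediately; the reactor performs the I/O.
      http = EventMachine::HttpRequest.new(url).get

      http.callback do
        results[url] = http.response_header.status
        finish.call
      end

      http.errback do
        results[url] = http.error
        finish.call
      end
    end
  end

  results
end
```

Calling `fetch_concurrently(["https://example.com/a", "https://example.com/b"])` (placeholder URLs) keeps both requests in flight at the same time in one thread, which is the property that lets a single event-driven process hold many slow external calls open at once.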

Licensed under: CC-BY-SA with attribution
