Question

I am using Sidekiq and Redis To Go on a production site hosted on Heroku. I am spinning up multiple Sidekiq workers to do a job for me. Out of 600 workers, I got down to about 180 workers left before my workers got "stuck". They attempt to do a job, and I get one of two errors back:

WARN: {"retry"=>true, "queue"=>"default", "class"=>"F9LoadRecordWorker", "args"=>[25126], "jid"=>"0426e1db817e27986da6b636", "enqueued_at"=>1395332988.09929, "error_message"=>"Connection reset by peer - SSL_connect", "error_class"=>"Errno::ECONNRESET", "failed_at"=>1395337905.5061884, "retry_count"=>0}

or

WARN: {"retry"=>true, "queue"=>"default", "class"=>"F9LoadRecordWorker", "args"=>[25131], "jid"=>"79601ea488efc10f1fbcc433", "enqueued_at"=>1395332988.1172419, "error_message"=>"Connection refused - connect(2)", "error_class"=>"Errno::ECONNREFUSED", "failed_at"=>1395338127.4794347, "retry_count"=>1, "retried_at"=>1395338202.905867}

So the actual errors are either Connection reset by peer - SSL_connect or Connection refused - connect(2).

What is causing this? Why would about 400 workers succeed and then the last 200 or so get stuck in this loop of retrying and getting continuous errors?

Solution

RedisToGo is a shared-hosting Redis-as-a-service provider. Unfortunately, it is easy to overwhelm a shared Redis instance with too many client connections, which causes timeouts.

You are essentially DoSing your Redis host and RedisToGo's infrastructure.

You may have to upgrade to a bigger plan or more robust hosting to support the number of connections you want.
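
To judge whether a given plan can support your setup, it can help to estimate how many Redis connections the deployment opens. The following is only a rough sketch in plain Ruby: the helper names, the per-process overhead, and every number in it are illustrative assumptions, not values published by RedisToGo or Sidekiq. Check your plan's actual connection limit and your Sidekiq version's behaviour before relying on it.

```ruby
# Rough estimate of how many Redis connections a Sidekiq deployment may open.
# Every number used here is an illustrative assumption, not a documented limit.

# Connections one Sidekiq process may hold: one per worker thread plus a few
# for internal housekeeping (the overhead varies by Sidekiq version, so it is
# passed in as a parameter rather than hard-coded).
def connections_per_process(concurrency, overhead)
  concurrency + overhead
end

# Total across all worker processes (dynos), plus the connections used by
# web processes that only enqueue jobs.
def estimated_connections(worker_processes:, concurrency:, overhead:, client_connections:)
  worker_processes * connections_per_process(concurrency, overhead) + client_connections
end

# Largest per-process concurrency that stays within a plan's connection limit.
# Returns 0 when even a single thread per process would exceed the limit.
def max_safe_concurrency(limit:, worker_processes:, overhead:, client_connections:)
  available = limit - client_connections
  per_process = available / worker_processes # integer division
  [per_process - overhead, 0].max
end

# Hypothetical deployment: 2 worker dynos, 25 threads each, 5 web connections.
total = estimated_connections(worker_processes: 2, concurrency: 25,
                              overhead: 2, client_connections: 5)
puts "estimated connections: #{total}"

# Hypothetical plan that allows 10 simultaneous connections.
safe = max_safe_concurrency(limit: 10, worker_processes: 2,
                            overhead: 2, client_connections: 2)
puts "max safe concurrency per process: #{safe}"
```

If the estimate exceeds what the plan allows, lowering Sidekiq's thread count (the `-c` command-line flag or the `:concurrency:` key in `config/sidekiq.yml`) is an alternative to upgrading.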

Licensed under: CC-BY-SA with attribution