Question

TL;DR:

I'm queuing short, simple tasks to celeryd through a beanstalkd broker using task.delay() (e.g. myNotifyTask.delay() instead of myNotifyTask()). Even though .delay() should dispatch the task for immediate execution, tasks take around an hour to run when they should take mere seconds.

From my observations, the tasks are indeed received by beanstalkd, but they sit in the ready state for a very long time. This happens despite setting CELERYD_CONCURRENCY = 8. beanstalkd's log shows read(): Connection reset by peer errors, but the tasks do eventually execute.

Any ideas why this could happen?

Details below.


Using beanstalkd 1.4.6 and celery 3.0.20.
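Roughly how the tasks are defined and queued (a minimal sketch; the module and task body are placeholders, the broker URL matches the one used below):

# tasks.py -- minimal sketch, names and task body are placeholders
from celery import Celery

celery = Celery('tasks', broker='beanstalk://beanstalk_server:11300')

# Celery 3.0-style setting: run up to 8 worker processes
celery.conf.CELERYD_CONCURRENCY = 8

@celery.task
def myNotifyTask():
    # short, simple work that should finish in seconds
    print('notification sent')

if __name__ == '__main__':
    # producer side: .delay() only enqueues the task for asynchronous
    # execution; it does not add any time delay
    myNotifyTask.delay()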

The beanstalk log entries look like this:

/usr/bin/beanstalkd: prot.c:709 in check_err: read(): Connection reset by peer

When attempting to use celery to diagnose the problem:

> celery -b "beanstalk://beanstalk_server:11300" status
Error: No nodes replied within time constraint.

When connecting to beanstalkd via telnet, I see current-jobs-ready: 343, which suggests jobs are stuck in the ready state (not delayed). Here is the full output:

> telnet localhost 11300
stats
OK 850
---
current-jobs-urgent: 343
current-jobs-ready: 343
current-jobs-reserved: 0
current-jobs-delayed: 0
current-jobs-buried: 0
cmd-put: 2484
cmd-peek: 0
cmd-peek-ready: 7
cmd-peek-delayed: 1
cmd-peek-buried: 1
cmd-reserve: 0
cmd-reserve-with-timeout: 52941
cmd-delete: 2141
cmd-release: 0
cmd-use: 2485
cmd-watch: 42
cmd-ignore: 40
cmd-bury: 0
cmd-kick: 0
cmd-touch: 0
cmd-stats: 497655
cmd-stats-job: 2141
cmd-stats-tube: 3
cmd-list-tubes: 2
cmd-list-tube-used: 1
cmd-list-tubes-watched: 52954
cmd-pause-tube: 0
job-timeouts: 0
total-jobs: 2484
max-job-size: 65535
current-tubes: 3
current-connections: 6
current-producers: 2
current-workers: 2
current-waiting: 1
total-connections: 502958
pid: 989
version: 1.4.6
rusage-utime: 45.778861
rusage-stime: 56.595537
uptime: 2489047
binlog-oldest-index: 0
binlog-current-index: 0
binlog-max-size: 10485760

And shortly after:

stats-tube celery
OK 257
---
name: celery
current-jobs-urgent: 348
current-jobs-ready: 348
current-jobs-reserved: 0
current-jobs-delayed: 0
current-jobs-buried: 0
total-jobs: 2739
current-using: 3
current-watching: 1
current-waiting: 0
cmd-pause-tube: 0
pause: 0
pause-time-left: 0

Solution

It turns out the problem was that one celery task had a very long timeout, which caused the worker running it to block for a long time. Even with concurrency enabled, every worker process eventually got stuck on one of these slow tasks, so celery stopped consuming from beanstalkd and the jobs kept piling up in the ready state.
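If the slow task can't be shortened, one option (not necessarily what I ended up doing) is to cap task runtime with Celery's time limits, so a single long-running task can't hold a worker process indefinitely. A minimal sketch with example values, where mySlowTask is a stand-in for the offending task:

import time

from celery import Celery
from celery.exceptions import SoftTimeLimitExceeded

celery = Celery('tasks', broker='beanstalk://beanstalk_server:11300')

# Celery 3.0-style global limits (example values): soft warning at 50s,
# hard kill at 60s
celery.conf.CELERYD_TASK_SOFT_TIME_LIMIT = 50
celery.conf.CELERYD_TASK_TIME_LIMIT = 60

# The limits can also be set per task
@celery.task(soft_time_limit=50, time_limit=60)
def mySlowTask():
    try:
        time.sleep(3600)  # stand-in for the long-running work
    except SoftTimeLimitExceeded:
        pass  # clean up and return the worker to the pool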

Licensed under: CC-BY-SA with attribution