Question

I'm having trouble finding documentation on the Gunicorn/Django process/thread lifecycle.

Let's say a daemon thread is spawned during the process_response() middleware hook. AFAIK this thread doesn't block the HTTP response. But, does it block the thread from which it was spawned? Does Gunicorn wait for this thread's completion to join it back to the main thread before the worker process is ready to handle another request, or will this thread be detached?

data_collection/tasks.py:

from celery import shared_task

@shared_task(ignore_result=True)
def add_event(event_name, event_body):
    ...
    client.add_event(event_name, event_body)

data_collection/middleware.py:

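import threading
from data_collection.tasks import add_event

class DataCollectionMiddleware:
    def process_response(self, request, response):
        ...
        # Enqueue from a daemon thread so the broker round trip doesn't delay the response.
        # (Thread.setDaemon() is deprecated; pass daemon=True instead.)
        thread = threading.Thread(
            target=add_event.delay, args=("Page_Views", event_body), daemon=True
        )
        thread.start()
        # Django requires process_response() to return the response object.
        return response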

More detail:

I've written a custom middleware class to send some data to an external queue (RabbitMQ), which is later retrieved and processed asynchronously by a Celery worker. I don't want this over-the-wire enqueue call to block the client's response, so I wrap that function (add_event.delay()) in a "daemon" thread (a la http://www.artfulcode.net/articles/threading-django/). This thread could run for a long time if there's a network outage and the retry policy has a long limit. In that case, would these threads block my Gunicorn worker processes?

I read the question "Danger to having long lasting (non-deamon) threads in a Django/Gunicorn app?", but I'm not sure whether my thread interferes with the "Worker's main loop".

Solution

No. There is nothing special about threads spawned from Gunicorn worker main threads.

Once you spawn a thread, it runs concurrently with the thread that started it until it completes or dies. Gunicorn doesn't know about threads spawned from worker main threads, so it never tries to join them, and the worker main thread will not wait for a child thread to finish before handling the next request. The daemon flag makes no difference here; "daemon" only means that the thread doesn't contribute to the "alive-ness" of the process: it runs until the process exits, at which point it is killed automatically.
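To see that starting a thread doesn't block the thread that spawned it, here's a minimal sketch outside of Gunicorn and Django entirely. The names slow_task and spawn_and_return are made up for the demo, and time.sleep() stands in for the slow enqueue call.

```python
import threading
import time


def slow_task(done_event, seconds):
    # Stand-in for a slow network call, e.g. an enqueue that keeps retrying.
    time.sleep(seconds)
    done_event.set()


def spawn_and_return(seconds=0.5):
    """Start a daemon thread and measure how long start() itself took."""
    done = threading.Event()
    thread = threading.Thread(target=slow_task, args=(done, seconds), daemon=True)
    started = time.monotonic()
    thread.start()  # returns as soon as the thread has been launched
    elapsed = time.monotonic() - started
    return thread, done, elapsed


if __name__ == "__main__":
    thread, done, elapsed = spawn_and_return(0.5)
    print(f"start() took {elapsed:.4f}s; task finished yet? {done.is_set()}")
    thread.join()  # only so the demo exits cleanly; nothing forces you to join
    print(f"after join(): task finished? {done.is_set()}")
```

The spawning thread carries on immediately after start(); the only place it waits is the explicit join() at the end.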

If you want to wait for these threads to complete before re-using the same worker thread, you have to do that before the WSGI application (e.g. django.core.handlers.wsgi.WSGIHandler.__call__()) returns. Or write some crazy monkey-patch for Gunicorn to keep track of child threads.
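If you do want to wait, a bounded join() before returning is probably the simplest way. This is only a sketch: run_with_bounded_wait is a hypothetical helper, not something Gunicorn or Django provides, and the wait does delay the response by up to wait_seconds.

```python
import threading
import time


def run_with_bounded_wait(target, args=(), wait_seconds=1.0):
    """Run target in a daemon thread and wait at most wait_seconds for it.

    Returns True if the thread finished within the wait, False if it is
    still running (in which case it keeps going in the background).
    """
    thread = threading.Thread(target=target, args=args, daemon=True)
    thread.start()
    thread.join(timeout=wait_seconds)  # blocks the caller, but never longer than the timeout
    return not thread.is_alive()


if __name__ == "__main__":
    print(run_with_bounded_wait(time.sleep, (0.05,), wait_seconds=1.0))
    print(run_with_bounded_wait(time.sleep, (0.5,), wait_seconds=0.05))
```

In the middleware from the question, you would call something like this with add_event.delay as the target, just before process_response() returns the response.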

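TL;DR: the child threads won't block your workers, but nothing stops them from piling up. If every request spawns a long-running child thread from the worker main thread, the thread count can grow without bound. It's best to put timeouts on the work so that each thread is guaranteed to finish within some time bound.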
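One way to put a hard cap on the number of background threads is a small pool plus a semaphore that refuses new work when every slot is taken. This is a sketch under assumptions: BoundedBackgroundRunner is a made-up helper, and silently dropping an event when the pool is saturated may or may not be acceptable for your data.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedBackgroundRunner:
    """Hypothetical helper that caps the number of in-flight background tasks."""

    def __init__(self, max_pending=4):
        self._slots = threading.BoundedSemaphore(max_pending)
        self._pool = ThreadPoolExecutor(max_workers=max_pending)

    def submit(self, func, *args):
        """Schedule func(*args); return a Future, or None if all slots are busy."""
        # Refuse new work instead of piling up threads or queued tasks.
        if not self._slots.acquire(blocking=False):
            return None
        try:
            future = self._pool.submit(func, *args)
        except Exception:
            self._slots.release()
            raise
        # Free the slot once the task finishes, whether it succeeded or failed.
        future.add_done_callback(lambda _future: self._slots.release())
        return future


if __name__ == "__main__":
    runner = BoundedBackgroundRunner(max_pending=2)
    gate = threading.Event()
    results = [runner.submit(gate.wait, 5) for _ in range(3)]
    print([r is not None for r in results])  # the third submit is refused
    gate.set()
```

The cap only bounds how many tasks exist at once; each task still needs its own time limit. For the enqueue in the question, I believe Celery's apply_async() accepts retry and retry_policy arguments (e.g. max_retries) that limit how long publishing keeps retrying, but check the docs for your Celery version.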

Licensed under: CC-BY-SA with attribution