I've found the issue. Turns out my application code is fine, and the problem lies with Waitress and nginx:
1. Waitress, the default web server Pyramid uses, buffers all output in 18000-byte chunks (see this issue for details).
2. The source of the problem was hidden from me by nginx, the web server I put in front of my Pyramid application, which also buffers responses.
(1) can be solved by either:
- Configuring Waitress with send_bytes = 1 in your .ini file. This fixes the streaming problem, but makes your entire app super slow. As @Zitrax mentioned, you can recover some speed with higher values, but any value higher than 1 risks messages getting stuck in the buffer.
- Switching to gunicorn. I don't know whether gunicorn just uses a smaller buffer, or if it behaves better with app_iter, but it worked, and kept my app fast.
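To see why a large send_bytes value can leave small messages stranded, here is a toy model in Python. It is only a sketch of the general idea (a buffer that flushes once it holds at least a threshold number of bytes), not Waitress's actual implementation; the ThresholdBuffer class and simulate helper are made up for illustration.

```python
class ThresholdBuffer:
    """Toy output buffer: flushes only once it holds >= send_bytes bytes.

    Hypothetical model for illustration; not Waitress's real code.
    """

    def __init__(self, send_bytes):
        self.send_bytes = send_bytes
        self.pending = b""  # bytes written by the app but not yet sent
        self.sent = []      # chunks that actually reached the client

    def write(self, data):
        self.pending += data
        if len(self.pending) >= self.send_bytes:
            self.sent.append(self.pending)
            self.pending = b""


def simulate(send_bytes, messages):
    """Feed messages through the buffer; return (sent chunks, stuck bytes)."""
    buf = ThresholdBuffer(send_bytes)
    for msg in messages:
        buf.write(msg)
    return buf.sent, buf.pending


# Two small Server-Sent Events messages, far below 18000 bytes.
events = [b"data: tick 1\n\n", b"data: tick 2\n\n"]

sent_default, stuck_default = simulate(18000, events)
sent_eager, stuck_eager = simulate(1, events)

print("threshold 18000:", len(sent_default), "chunks sent,", len(stuck_default), "bytes stuck")
print("threshold 1:    ", len(sent_eager), "chunks sent,", len(stuck_eager), "bytes stuck")
```

In this model, the 18000-byte threshold never flushes the two small events, while a threshold of 1 sends each one as soon as it is written. That matches the behaviour described above, at the cost of a flush per write.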
(2) can be solved by configuring nginx to disable buffering for your stream routes.
You need to set proxy_buffering off in your nginx conf. This setting applies to sites hosted via proxy_pass. If you're not using proxy_pass, you may need a different setting.
- You may configure nginx to dynamically enable/disable buffering for each response based on request headers, as shown in this question on the topic (a good solution for EventSource/Server-Sent Events).
- You may alternatively configure this in a location block in your nginx conf. This is good if you're using something besides EventSource and you're not expecting to receive a particular header, or if you are using EventSource, but want to debug the response from a plain browser tab, where you can't send the Accept header in your request.
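If you go with the header-based setup and still want to debug the stream outside of EventSource, a small script can send the Accept header that a plain browser tab can't. The following is a rough sketch using only the Python standard library. The function names and the URL in the usage comment are placeholders, and it assumes your stream route answers a plain GET.

```python
import time
import urllib.request


def build_stream_request(url):
    """Build a GET request that sends the Accept header EventSource would send."""
    return urllib.request.Request(url, headers={"Accept": "text/event-stream"})


def watch_stream(url, max_lines=10):
    """Print each received line with its arrival time.

    If buffering is still active somewhere, lines tend to show up in
    bursts instead of one at a time.
    """
    start = time.monotonic()
    with urllib.request.urlopen(build_stream_request(url)) as resp:
        for count, line in enumerate(resp, start=1):
            print(f"{time.monotonic() - start:6.2f}s  {line!r}")
            if count >= max_lines:
                break


# Usage (hypothetical URL, replace with your own stream route):
# watch_stream("http://localhost:8080/stream")
```

Evenly spaced timestamps suggest that responses are passing through unbuffered. Long pauses followed by several lines at once suggest that something between the app and the client is still buffering.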