Passing data to mod_wsgi

https://stackoverflow.com/questions/940816

06-09-2019
|

Question

In mod_wsgi I send the headers by running the function start_response(), but all the page content is passed by yield/return. Is there a way to pass the page content in a similar fashion as start_response()? Using the return.yield statement is very restrictive when it comes to working with chunked data.

E.g.

def Application():

    b = buffer()

    [... page code ...]

    while True:
        out = b.flush()    
        if out:
            yield out

class buffer:

    def __init__(self):        
        b = ['']
        l = 0

    def add(self, s):
        s = str(s)
        l += len(s)
        b.append(s)

    def flush(self):

        if self.l > 1000:
            out = ''.join(b)
            self.__init__()
            return out

I want to have the buffer outputting the content as the page loads, but only outputs the content once enough of it has piled up (in this eg. 1000 bytes).

Solution

No; But I don't think it is restrictive. Maybe you want to paste an example code where you describe your restriction and we can help.

To work with chunk data you just yield the chunks:

def application(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')]
    yield 'Chunk 1\n'    
    yield 'Chunk 2\n'    
    yield 'Chunk 3\n'
    for chunk in chunk_data_generator():
        yield chunk

def chunk_data_generator()
    yield 'Chunk 4\n'
    yield 'Chunk 5\n'

EDIT: Based in the comments you gave, an example of piling data up to a certain length before sending forward:

BUFFER_SIZE = 10 # 10 bytes for testing. Use something bigger
def application(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')]
    buffer = []
    size = 0
    for chunk in chunk_generator():
        buffer.append(chunk)
        size += len(chunk)
        if size > BUFFER_SIZE:
            for buf in buffer:
                yield buf
            buffer = []
            size = 0

def chunk_data_generator()
    yield 'Chunk 1\n'    
    yield 'Chunk 2\n'    
    yield 'Chunk 3\n'
    yield 'Chunk 4\n'
    yield 'Chunk 5\n'

OTHER TIPS

It is possible for your application to "push" data to the WSGI server:

Some existing application framework APIs support unbuffered output in a different manner than WSGI. Specifically, they provide a "write" function or method of some kind to write an unbuffered block of data, or else they provide a buffered "write" function and a "flush" mechanism to flush the buffer.

Unfortunately, such APIs cannot be implemented in terms of WSGI's "iterable" application return value, unless threads or other special mechanisms are used.

Therefore, to allow these frameworks to continue using an imperative API, WSGI includes a special write() callable, returned by the start_response callable.

New WSGI applications and frameworks should not use the write() callable if it is possible to avoid doing so.

http://www.python.org/dev/peps/pep-0333/#the-write-callable

But it isn't recommended.

Generally speaking, applications will achieve the best throughput by buffering their (modestly-sized) output and sending it all at once. This is a common approach in existing frameworks such as Zope: the output is buffered in a StringIO or similar object, then transmitted all at once, along with the response headers.

The corresponding approach in WSGI is for the application to simply return a single-element iterable (such as a list) containing the response body as a single string. This is the recommended approach for the vast majority of application functions, that render HTML pages whose text easily fits in memory.

http://www.python.org/dev/peps/pep-0333/#buffering-and-streaming

If you don't want to change your WSGI application itself to partially buffer response data before sending it, then implement a WSGI middleware that wraps your WSGI application and which performs that task.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow