Why does environ['wsgi.input'].read() block even though it is allowed by PEP-3333?

https://stackoverflow.com/questions/22894078

28-06-2023

Problem

Issue

Here is a simple WSGI application that is supposed to print the value of the Content-Length header and the request body.

def application(environ, start_response):
    start_response('200 OK', [('Content-Type','text/plain')])
    content_length = int(environ['CONTENT_LENGTH'])
    print('---- Begin ----')
    print('CONTENT_LENGTH:', content_length)
    print('wsgi.input:', environ['wsgi.input'].read())
    print('---- End ----')
    return [b'Foo\n']

if __name__ == '__main__':
    from wsgiref import simple_server
    server = simple_server.make_server('0.0.0.0', 8080, application)
    server.serve_forever()

When I run this application, it gets blocked at the following call: environ['wsgi.input'].read().

I run the application with the Python 3 interpreter and submit an HTTP POST request to it using curl.

lone@debian:~$ curl --data "a=1&b=2" http://localhost:8080/

The curl command gets blocked waiting for output. The python interpreter gets blocked at the environ['wsgi.input'].read() call.

lone@debian:~$ python3 foo.py
---- Begin ----
CONTENT_LENGTH: 7

As you can see in the output above, the application() function blocks after printing CONTENT_LENGTH.

Workaround

I know how to work around the issue: by passing the Content-Length header value to the read() call.

Modified code to work around the issue:

def application(environ, start_response):
    start_response('200 OK', [('Content-Type','text/plain')])
    content_length = int(environ['CONTENT_LENGTH'])
    print('---- Begin ----')
    print('CONTENT_LENGTH:', content_length)
    print('wsgi.input:', environ['wsgi.input'].read(content_length))
    print('---- End ----')
    return [b'Foo\n']

if __name__ == '__main__':
    from wsgiref import simple_server
    server = simple_server.make_server('0.0.0.0', 8080, application)
    server.serve_forever()

The curl command gets a valid HTTP response now.

lone@debian:~$ curl --data "a=1&b=2" http://localhost:8080/
Foo
lone@debian:~$

The application() function also completes its execution.

lone@debian:~$ python3 foo.py
---- Begin ----
CONTENT_LENGTH: 7
wsgi.input: b'a=1&b=2'
---- End ----
127.0.0.1 - - [06/Apr/2014 17:53:21] "POST / HTTP/1.1" 200 4

Question

Why does the environ['wsgi.input'].read() call block when read is called without any arguments?

The PEP-3333 document seems to imply it should work. Here is the relevant text.

The server is not required to read past the client's specified Content-Length, and should simulate an end-of-file condition if the application attempts to read past that point. The application should not attempt to read more data than is specified by the CONTENT_LENGTH variable.

A server should allow read() to be called without an argument, and return the remainder of the client's input stream.

I understand that the application should not attempt to read more data than is specified by the CONTENT_LENGTH variable, and I am disobeying this directive. But the server should allow read() to be called without an argument and return the remainder of the input stream. Why isn't it doing so?

Solution

Because the server only implements PEP 333 and not PEP 3333.

PEP 333 did not have the condition about simulating end of stream by returning an empty string once CONTENT_LENGTH bytes have been read.

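Under PEP 333 you would have problems if you tried to read more than CONTENT_LENGTH when the WSGI server supported HTTP/1.1 and request pipelining (keep-alive) was being used. The client leaves the connection open after sending the body, so a read() with no argument keeps waiting for an end-of-file that never arrives.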
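The blocking behaviour can be reproduced without a WSGI server at all. The following is only a sketch: it assumes that a file object made from a socket behaves much like the wsgi.input a simple server hands out, and it uses socket.socketpair() plus a short timeout (both choices of this example, not part of the question) so that the demonstration fails fast instead of hanging.

```python
import socket

# A connected pair of sockets stands in for the client/server connection.
client, server = socket.socketpair()
server.settimeout(0.5)           # assumption: a timeout so the demo cannot hang
rfile = server.makefile('rb')    # similar in spirit to a raw wsgi.input

# The "client" sends the body but keeps the connection open (keep-alive).
client.sendall(b'a=1&b=2')

# A bounded read returns as soon as the requested bytes are available.
bounded = rfile.read(7)

# An unbounded read waits for end-of-file, which an open connection never sends.
try:
    rfile.read()
    blocked = False
except socket.timeout:
    blocked = True

rfile.close()
client.close()
server.close()

print(bounded, blocked)
```

Without the timeout, the second read() would wait indefinitely, which matches what the question observes.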

I would suggest you read PEP 333 and compare the language to PEP 3333.
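If the application has to run on servers that do not simulate end-of-file, one option is to bound the stream yourself. The sketch below is an illustration only: LimitedInput and body_stream are hypothetical names, not part of wsgiref or either PEP, and the io.BytesIO object merely stands in for a real wsgi.input that has more data queued behind the body.

```python
import io


class LimitedInput:
    """Hypothetical wrapper that treats CONTENT_LENGTH as end-of-file."""

    def __init__(self, stream, length):
        self._stream = stream
        self._remaining = max(length, 0)

    def read(self, size=-1):
        # Never ask the underlying stream for more than the body has left.
        if size is None or size < 0 or size > self._remaining:
            size = self._remaining
        if size == 0:
            return b''           # simulated end-of-file
        data = self._stream.read(size)
        self._remaining -= len(data)
        return data


def body_stream(environ):
    # A missing or empty CONTENT_LENGTH is treated as "no body".
    length = int(environ.get('CONTENT_LENGTH') or 0)
    return LimitedInput(environ['wsgi.input'], length)


# Stand-in environ: 7 bytes of body followed by the start of another request.
environ = {
    'CONTENT_LENGTH': '7',
    'wsgi.input': io.BytesIO(b'a=1&b=2GET / HTTP/1.1'),
}
stream = body_stream(environ)
first = stream.read()    # the body only
second = stream.read()   # nothing left within CONTENT_LENGTH
print(first, second)
```

With such a wrapper, read() without an argument returns only the bytes covered by CONTENT_LENGTH and then an empty bytes object, which is the behaviour the quoted PEP 3333 text describes.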