Question

I have the following Werkzeug application for returning a file to the client:

from werkzeug.wrappers import Request, Response

@Request.application
def application(request):
    fileObj = open(r'C:\test.pdf', 'rb')
    response = Response( response=fileObj.read() )
    response.headers['content-type'] = 'application/pdf'
    return response

The part I want to focus on is this one:

response = Response( response=fileObj.read() )

In this case the response takes about 500 ms (C:\test.pdf is a 4 MB file, and the web server is on my local machine).

But if I rewrite that line to this:

response = Response()
response.response = fileObj

Now the response takes about 1500 ms. (3 times slower)

And if I write it like this:

response = Response()
response.response = fileObj.read()

Now the response takes about 80 seconds (that's right, 80 SECONDS).

Why is there that much difference between the 3 methods?
And why is the third method sooooo slow?


Solution 2

After some testing I think I've figured out the mystery.

@Armin already explained why this...

response = Response()
response.response = fileObj.read()

...is so slow. But that doesn't explain why this...

response = Response( response=fileObj.read() )

...is so fast. They appear to be the same thing, but obviously they are not; otherwise there wouldn't be such a tremendous difference in speed.

The key here is in this part of the docs: http://werkzeug.pocoo.org/docs/wrappers/

Response can be any kind of iterable or string. If it’s a string it’s considered being an iterable with one item which is the string passed.

That is, when you give a string to the constructor, it is converted to an iterable with the string as its only element. But when you do this: response.response = fileObj.read(), the string is used as is, so it gets iterated one character at a time.

So to make it behave like the constructor, you have to do this:

response.response = [ fileObj.read() ]

and now the file is sent as fast as possible.
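To make the difference concrete, here is a minimal sketch in plain Python (no Werkzeug involved). The helper count_chunks and the payload are made up for illustration; the helper only mimics how a WSGI server walks over the response iterable, one item per write.

```python
def count_chunks(body):
    """Count the pieces a server would get by iterating over body."""
    return sum(1 for _ in body)

# Hypothetical stand-in for the result of fileObj.read()
payload = "x" * 4096

# Bare string: the string itself is the iterable, one character per item
chunks_bare = count_chunks(payload)

# One-item list: the whole string arrives as a single item
chunks_wrapped = count_chunks([payload])

print(chunks_bare)     # 4096
print(chunks_wrapped)  # 1
```

With a 4 MB file the bare string would produce millions of tiny writes, which is consistent with the 80 seconds seen in the question.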

Other tips

The answer to that is pretty simple:

  • x.read(): reads the whole file into memory, which is inefficient.
  • Setting response to a file: very inefficient, because the protocol for that object is an iterator, so you send the file line by line. If the file is binary, the chunk sizes are effectively random.
  • Setting response to a string: bad idea. The response is iterated, as mentioned above, so you are now sending each character of the string as a separate chunk.
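The "line by line" point can be checked without a web server. The sketch below uses io.BytesIO as a stand-in for a binary file; the sample bytes and the read_blocks helper are made up for illustration.

```python
import io

# Stand-in for a binary file: newline bytes appear at arbitrary offsets.
data = b"ab\ncdefg\n\nh"

# Iterating a file object yields "lines", i.e. it splits after each b"\n".
line_sizes = [len(chunk) for chunk in io.BytesIO(data)]

def read_blocks(fileobj, block_size):
    """Yield fixed-size blocks until read() returns an empty bytes object."""
    return iter(lambda: fileobj.read(block_size), b"")

# Reading fixed-size blocks gives predictable chunk sizes instead.
block_sizes = [len(chunk) for chunk in read_blocks(io.BytesIO(data), 4)]

print(line_sizes)   # [3, 6, 1, 1]
print(block_sizes)  # [4, 4, 3]
```

For a real PDF the "line" lengths depend on wherever newline bytes happen to fall in the data, which is why the chunk sizes look random.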

The correct solution is to wrap the file in the file wrapper provided by the WSGI server:

from werkzeug.wsgi import wrap_file
return Response(wrap_file(environ, yourfile), direct_passthrough=True)

The direct_passthrough flag is required so that the response object does not attempt to iterate over the file wrapper but leaves it untouched for the WSGI server.
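For readers curious what the wrapper does, here is a rough standard-library sketch of the idea, written as a bare WSGI app rather than with Werkzeug. As far as I understand, wrap_file looks up the optional 'wsgi.file_wrapper' key from PEP 3333 in the environ and falls back to a generic block reader. The names wrap_file_sketch and PDF_STANDIN are hypothetical, and the in-memory bytes stand in for a real file.

```python
import io
from wsgiref.util import FileWrapper

# Hypothetical stand-in for the contents of a PDF on disk (20005 bytes).
PDF_STANDIN = b"%PDF-" + b"\0" * 20000

def wrap_file_sketch(environ, fileobj, buffer_size=8192):
    """Use the server's file wrapper if it offers one, else read in blocks."""
    wrapper = environ.get("wsgi.file_wrapper", FileWrapper)
    return wrapper(fileobj, buffer_size)

def application(environ, start_response):
    fileobj = io.BytesIO(PDF_STANDIN)
    start_response("200 OK", [("Content-Type", "application/pdf")])
    # Return the wrapper itself; the server iterates it (or streams the
    # underlying file in its own optimized way), not the application.
    return wrap_file_sketch(environ, fileobj)

# Simulate a server with no wrapper of its own, so the fallback is used.
chunks = list(application({}, lambda status, headers: None))
print([len(c) for c in chunks])  # [8192, 8192, 3621]
```

In a view decorated with @Request.application, the environ to pass to wrap_file is available as request.environ.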

I can't give you a precise answer as to why this occurs; however, http://werkzeug.pocoo.org/docs/wsgi/#werkzeug.wsgi.wrap_file may help address your underlying problem.

Licensed under: CC-BY-SA with attribution