Question

I'm making a program using the website http://placekitten.com, but I've run into a bit of a problem. Using this:

im = urllib2.urlopen(url).read()
f = open('kitten.jpeg', 'w')
f.write(im)
f.close()

The image turns out distorted with mismatched colors, like this:

http://imgur.com/zVg64Kn.jpeg

I was wondering if there was an alternative to extracting images with urllib2. If anyone could help, that would be great!


Solution

You need to open the file in binary mode:

f = open('kitten.jpeg', 'wb')

Otherwise Python translates line endings to the platform's native form on writing (on Windows, each '\n' is written as '\r\n'), which corrupts binary data. This is documented for the open() function:

The default is to use text mode, which may convert '\n' characters to a platform-specific representation on writing and back on reading. Thus, when opening a binary file, you should append 'b' to the mode value to open the file in binary mode, which will improve portability.
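
To make the corruption concrete, here is a minimal sketch that simulates what text mode does on Windows when writing. It does not go through Python's real file layer; the helper name simulate_windows_text_write and the sample bytes are made up for illustration.

```python
def simulate_windows_text_write(data):
    """Mimic Windows text-mode writing: each LF byte becomes CR LF."""
    return data.replace(b'\n', b'\r\n')

# Hypothetical stand-in for image bytes: JPEG start and end markers around
# payload bytes that happen to include the value 0x0A twice.
original = b'\xff\xd8\x00\x0a\x10\x0a\xff\xd9'
written = simulate_windows_text_write(original)

# Two extra bytes were injected, so the data on disk no longer matches.
print('%d bytes in, %d bytes out' % (len(original), len(written)))
print('unchanged: %s' % (written == original))
```

Every 0x0A byte in the image picks up an extra 0x0D, which shifts everything after it. That is consistent with the distorted, mis-coloured result described in the question.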

When copying data from a URL to a file, you could use shutil.copyfileobj() to handle streaming efficiently:

import urllib2
from shutil import copyfileobj

# url is the image URL from the question
im = urllib2.urlopen(url)
with open('kitten.jpeg', 'wb') as out:
    copyfileobj(im, out)

This will read data in chunks, avoiding filling memory with large blobs of binary data. The with statement handles closing the file object for you.
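
As a rough illustration of what chunked copying means, the sketch below writes the loop out by hand and compares it with shutil.copyfileobj(). The helper copy_in_chunks, the chunk size, and the in-memory payload are assumptions for the example; io.BytesIO objects stand in for the HTTP response and the output file so that nothing touches the network.

```python
import io
import shutil

def copy_in_chunks(src, dst, chunk_size=16 * 1024):
    """Copy src to dst one chunk at a time, never holding the whole body."""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty read means the stream is exhausted
            break
        dst.write(chunk)

# Hypothetical payload, larger than one chunk, standing in for image data.
payload = b'\xff\xd8' + b'\x0a' * 50000 + b'\xff\xd9'

manual_out = io.BytesIO()
copy_in_chunks(io.BytesIO(payload), manual_out)

library_out = io.BytesIO()
shutil.copyfileobj(io.BytesIO(payload), library_out)

# Both approaches should produce an exact copy of the source bytes.
print(manual_out.getvalue() == payload)
print(library_out.getvalue() == payload)
```

In practice there is little reason to write the loop yourself; copyfileobj() accepts any pair of file-like objects, including the response returned by urlopen().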

OTHER TIPS

Change

f = open('kitten.jpeg', 'w')

to read

f = open('kitten.jpeg', 'wb')

See http://docs.python.org/2/library/functions.html#open for more information. In text mode, any byte in the JPEG that happens to match a newline character gets rewritten when the file is saved; opening the file in binary mode prevents this.

If you're using Windows, you have to open the file in binary mode:

f = open('kitten.jpeg', 'wb')

Or more Pythonically:

import urllib2

url = 'http://placekitten.com.s3.amazonaws.com/homepage-samples/200/140.jpg'
image = urllib2.urlopen(url).read()

with open('kitten.jpg', 'wb') as handle:
    handle.write(image)
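
If you want to convince yourself that binary mode leaves the data alone, a small round-trip check like the one below may help. It is only a sketch: the byte string is a made-up stand-in for a downloaded image, and the file goes into a temporary directory rather than next to your script.

```python
import os
import shutil
import tempfile

# Hypothetical stand-in for downloaded image bytes, including 0x0A and 0x0D.
image = b'\xff\xd8\x0a\x0d\x0a\x00\xff\xd9'

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'kitten.jpg')

# Write and read back in binary mode so no newline translation can occur.
with open(path, 'wb') as handle:
    handle.write(image)
with open(path, 'rb') as handle:
    round_tripped = handle.read()

shutil.rmtree(tmp_dir)  # clean up the temporary directory

print('identical: %s' % (round_tripped == image))
```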
Licensed under: CC-BY-SA with attribution