Question

I am saving a picture file from a server to a bucket in S3 in the following way:

request = urllib2.Request('http://link.to/file.jpg')
response = urllib2.urlopen(request)
jpg_data = response.read()

storage = S3BotoStorage(bucket='icanhazbukkit')
my_file = storage.open(path_to_new_file, 'w')
my_file.write(jpg_data)
my_file.close()

The file gets written, but somewhere along the way the MIME type gets lost: the saved image is served with Content-Type: binary/octet-stream, and the browser tries to download it instead of displaying it when its URL is hit.

Any way I can mitigate this?


Solution

When you do

jpg_data = response.read()

you get back only the raw bytes, so I believe boto loses the information about the file extension, which it uses to guess the MIME type. By the time you store it with

my_file.write(jpg_data)

all boto/S3 knows is that it has some sort of binary data to write.
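
To illustrate the kind of extension-based guessing described above, here is a minimal sketch using Python's standard mimetypes module. It is not boto's actual code; the helper name guess_content_type and the fallback value are assumptions chosen for illustration.

```python
import mimetypes


def guess_content_type(name, default="binary/octet-stream"):
    """Guess a MIME type from a file name's extension (hypothetical helper)."""
    content_type, _encoding = mimetypes.guess_type(name)
    # No recognizable extension means no guess, so fall back to the default.
    return content_type or default


print(guess_content_type("file.jpg"))
print(guess_content_type("file"))
```

A name ending in .jpg maps to image/jpeg, while a name with no extension falls through to the default. A bare byte string carries no name at all, so there is nothing to guess from.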

If you replace these lines in your program:

storage = S3BotoStorage(bucket='icanhazbukkit')
my_file = storage.open(path_to_new_file, 'w')
my_file.write(jpg_data)
my_file.close()

with

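from boto.s3.connection import S3Connection
from boto.s3.key import Key

conn = S3Connection()  # assumes AWS credentials come from the environment or boto config
bucket = conn.get_bucket('icanhazbukkit')  # the bucket already exists, so look it up
k = Key(bucket)
k.name = "yourfilename"
header = {'Content-Type': 'image/jpeg'}
k.set_contents_from_string(jpg_data, header)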

then you can control the Content-Type yourself by passing it as the headers argument of set_contents_from_string.

If you want to preserve the Content-Type from your original get, you can do this:

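import urllib2
from boto.s3.connection import S3Connection
from boto.s3.key import Key

request = urllib2.Request('http://link.to/file.jpg')
response = urllib2.urlopen(request)
file_data = response.read()

conn = S3Connection()  # assumes AWS credentials come from the environment or boto config
bucket = conn.get_bucket('icanhazbukkit')
k = Key(bucket)
k.name = "yourfilename"
# Reuse the Content-Type the source server sent with the original response
origType = response.info().gettype()
header = {'Content-Type': origType}
k.set_contents_from_string(file_data, header)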
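
If the source server might omit the Content-Type header, one option is to fall back to guessing from the URL's extension. The following is only a sketch under that assumption: choose_content_type is a hypothetical helper, not part of boto or urllib2, and it is written for Python 3 (on Python 2 the import would be from urlparse import urlparse).

```python
import mimetypes
from urllib.parse import urlparse


def choose_content_type(header_value, url, default="application/octet-stream"):
    """Prefer the server's Content-Type; otherwise guess from the URL path."""
    if header_value:
        # Drop parameters such as "; charset=..." and keep only the media type.
        media_type = header_value.split(";", 1)[0].strip().lower()
        if media_type:
            return media_type
    # Use only the path so a query string does not hide the extension.
    guessed, _encoding = mimetypes.guess_type(urlparse(url).path)
    return guessed or default


header = {"Content-Type": choose_content_type(None, "http://link.to/file.jpg")}
print(header)
```

The resulting dictionary can be passed as the headers argument in the same way as in the snippets above.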
Licensed under: CC-BY-SA with attribution