Question

Saving a file in MongoDB's GridFS with pymongo results in a truncated file.

from pymongo import MongoClient
import gridfs
import os

#just to make sure we aren't crazy, check the filesize on disk:
print os.path.getsize( r'owl.jpg' )

#add the file to GridFS, per the pymongo documentation: http://api.mongodb.org/python/current/examples/gridfs.html
db = MongoClient().myDB
fs = gridfs.GridFS( db )
fileID = fs.put( open( r'owl.jpg', 'r')  )
out = fs.get(fileID)
print out.length

On Windows 7, running this program generates this output:

145047
864

On Ubuntu, running this program generates this (correct) output:

145047
145047

Unfortunately, the application I'm working on is targeting the Windows OS...

Any help would be appreciated!

So that you can reproduce my example more rigorously: 'owl.jpg' was downloaded from http://getintobirds.audubon.org/sites/default/files/photos/wildlife_barn_owl.jpg


Solution

Heh, changing

fileID = fs.put( open( r'owl.jpg', 'r')  )

to:

fileID = fs.put( open( r'owl.jpg', 'rb')  )

fixes the behavior of the program on Windows 7. Too bad the behavior differs between operating systems...
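For anyone who wants to guard against this in their own code, here is a minimal sketch of an upload helper that opens the file in binary mode and then compares the stored length with the size on disk, the same check the question does by hand. The name put_file_checked is made up for illustration, and fs is assumed to be a gridfs.GridFS instance like the one in the question.

```python
import os


def put_file_checked(fs, path):
    # Hypothetical helper. `fs` is assumed to be a gridfs.GridFS instance.
    # 'rb' hands the bytes to GridFS exactly as they are on disk.
    with open(path, 'rb') as f:
        file_id = fs.put(f, filename=os.path.basename(path))

    # Compare what GridFS reports with the size on disk.
    stored = fs.get(file_id).length
    on_disk = os.path.getsize(path)
    if stored != on_disk:
        raise IOError('stored %d bytes, expected %d' % (stored, on_disk))
    return file_id
```

Usage would look like fileID = put_file_checked(fs, r'owl.jpg'). The with block also closes the file, which the original one-liner never did.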

OTHER TIPS

You already got the answer, but for the curious:

http://docs.python.org/2/tutorial/inputoutput.html#reading-and-writing-files

On Windows, 'b' appended to the mode opens the file in binary mode, so there are also modes like 'rb', 'wb', and 'r+b'. Python on Windows makes a distinction between text and binary files; the end-of-line characters in text files are automatically altered slightly when data is read or written. This behind-the-scenes modification to file data is fine for ASCII text files, but it’ll corrupt binary data like that in JPEG or EXE files. Be very careful to use binary mode when reading and writing such files.
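To see whether a given file contains bytes that text mode would touch, a small diagnostic like the one below can help. It is only a sketch: text_mode_hazards is a made-up name, and the Ctrl-Z part rests on my understanding that the Windows C runtime treats byte 0x1A as end-of-file in text mode, which would explain a read stopping early (I have not checked that this is what happens at byte 864 of owl.jpg).

```python
def text_mode_hazards(path):
    # Hypothetical diagnostic: read the raw bytes in binary mode and count
    # the sequences that Windows text mode is expected to alter.
    with open(path, 'rb') as f:
        data = f.read()
    return {
        # CRLF pairs are collapsed to LF when read in text mode on Windows.
        'crlf': data.count(b'\r\n'),
        # Assumption: 0x1A (Ctrl-Z) is treated as end-of-file in text mode.
        'ctrl_z': data.count(b'\x1a'),
        # Offset of the first 0x1A, or -1 if there is none.
        'first_ctrl_z': data.find(b'\x1a'),
    }
```

If first_ctrl_z comes back equal to the truncated length you are seeing, that is a strong hint the file was opened without 'b'.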

Licensed under: CC-BY-SA with attribution