Question

The following code, executed in Python 2.7.2 on Windows, reads only a fraction of the underlying file:

import os

in_file = open(os.path.join(settings.BASEPATH,'CompanyName.docx'))
incontent = in_file.read()
in_file.close()

while this code works just fine:

import io
import os

in_file = io.FileIO(os.path.join(settings.BASEPATH,'CompanyName.docx'))
incontent = in_file.read()
in_file.close()

Why the difference? From my reading of the docs, they should perform identically.


Solution

You need to open the file in binary mode. In text mode on Windows, read() stops at the first EOF character (Ctrl-Z, byte 0x1A) it finds, and \r\n line endings are translated to \n. A docx is a ZIP file, which is binary data and almost certain to contain such a byte somewhere.

Try

in_file = open(os.path.join(settings.BASEPATH,'CompanyName.docx'), "rb")

FileIO reads raw byte streams, so it is always "binary" regardless of platform, which is why the second snippet reads the whole file.
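
As a rough way to check this yourself, the sketch below writes a small made-up payload (a ZIP-style "PK" header followed by a 0x1A byte and a \r\n pair) to a temporary file, then reads it back with open(..., "rb") and with io.FileIO. The payload and the temporary file are assumptions for illustration only, not the original CompanyName.docx. Both reads should return every byte unchanged:

```python
import io
import os
import tempfile

# Hypothetical sample data: ZIP-like magic bytes, then 0x1A (Ctrl-Z)
# and a CRLF pair, the bytes that text mode on Windows treats specially.
payload = b"PK\x03\x04" + b"\x1a" + b"more\r\ndata"

fd, path = tempfile.mkstemp(suffix=".bin")
try:
    # Write the payload in binary mode so nothing is translated on the way in.
    with os.fdopen(fd, "wb") as out_file:
        out_file.write(payload)

    # Built-in open() with an explicit "rb" mode.
    with open(path, "rb") as in_file:
        via_open = in_file.read()

    # io.FileIO defaults to mode "r" and always works on raw bytes.
    with io.FileIO(path) as in_file:
        via_fileio = in_file.read()
finally:
    os.remove(path)

# All three lengths should match if no byte was dropped or translated.
print(len(payload), len(via_open), len(via_fileio))
```

Using a with block also closes the file even if read() raises, so the explicit close() call is no longer needed.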

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow