Can I still use StringIO when the containing Writer() closes it?
Question
I am using the Python avro library. I want to send an avro file over http, but I don't particularly want to save that file to disk first, so I thought I'd use StringIO to house the file contents until I'm ready to send. But avro.datafile.DataFileWriter thoughtfully takes care of closing the file handle for me, which makes it difficult for me to get the data back out of the StringIO. Here's what I mean in code:
from StringIO import StringIO
from avro.datafile import DataFileWriter
from avro import schema, io
from testdata import BEARER, PUBLISHURL, SERVER, TESTDATA
from httplib2 import Http

HTTP = Http()

##
# Write the message data to a StringIO
#
# @return StringIO
#
def write_data():
    message = TESTDATA
    schema = getSchema()
    datum_writer = io.DatumWriter(schema)
    data = StringIO()
    with DataFileWriter(data, datum_writer, writers_schema=schema, codec='deflate') as datafile_writer:
        datafile_writer.append(message)
        # If I return data inside the with block, the DFW buffer isn't flushed
        # and I may get an incomplete file
    return data

##
# Make the POST and dump its response
#
def main():
    headers = {
        "Content-Type": "avro/binary",
        "Authorization": "Bearer %s" % BEARER,
        "X-XC-SCHEMA-VERSION": "1.0.0",
    }
    body = write_data().getvalue() # AttributeError: StringIO instance has no attribute 'buf'
    # the StringIO instance returned by write_data() is already closed. :(
    resp, content = HTTP.request(
        uri=PUBLISHURL,
        method='POST',
        body=body,
        headers=headers,
    )
    print resp, content
I do have some workarounds I can use, but none of them are terribly elegant. Is there any way to get the data from the StringIO after it's closed?
Solution
Not really.
The docs are very clear on this:
StringIO.close()
Free the memory buffer. Attempting to do further operations with a closed StringIO object will raise a ValueError.
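A minimal sketch of that behaviour, assuming io.BytesIO as a stand-in for the Python 2 StringIO class so that the snippet also runs on Python 3:

```python
import io

buf = io.BytesIO()
buf.write(b"avro bytes")
buf.close()  # frees the in-memory buffer

try:
    buf.getvalue()
    error = None
except ValueError as exc:
    # A closed io.BytesIO refuses any further operation.
    error = exc
```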
The cleanest way of doing it would be to inherit from StringIO and override the close method to do nothing:
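class MyStringIO(StringIO):
    def close(self):
        # Ignore close() so the contents survive the writer's cleanup
        pass
    def _close(self):
        # StringIO.StringIO is an old-style class in Python 2, so super() cannot be used here
        StringIO.close(self)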
And call _close() when you're ready.
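Here is a self-contained sketch of the same idea that can be run without avro installed. It assumes io.BytesIO in place of the Python 2 StringIO class (so it also works on Python 3), and ClosingWriter is a hypothetical stand-in that only mimics the one behaviour that matters here: closing the file object it was given when the with block exits. It is not avro's API.

```python
import io


class NonClosingBytesIO(io.BytesIO):
    """In-memory buffer that ignores close() (hypothetical name)."""

    def close(self):
        # Ignore close() calls made by whoever the buffer was handed to.
        pass

    def really_close(self):
        # Release the buffer once the caller has read the contents.
        super(NonClosingBytesIO, self).close()


class ClosingWriter(object):
    """Stand-in for a writer that closes its file on exit (not avro's API)."""

    def __init__(self, fileobj):
        self.fileobj = fileobj

    def append(self, payload):
        self.fileobj.write(payload)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.fileobj.close()


def write_data(payload):
    data = NonClosingBytesIO()
    with ClosingWriter(data) as writer:
        writer.append(payload)
    return data  # still open, because close() was a no-op


buf = write_data(b"hello")
body = buf.getvalue()  # readable even though the writer "closed" the buffer
buf.really_close()     # now the memory is actually released
```

The caller becomes responsible for calling really_close(), which is the trade-off of this approach.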
OTHER TIPS
I was looking to do exactly the same thing. DataFileWriter has a flush method, so you should be able to call flush after the call to append and then read the data out of the StringIO (for example with getvalue()) before the with block exits and closes it. That seems a little more elegant to me than deriving a class from StringIO.
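A rough sketch of the flush approach follows. BufferingWriter is a hypothetical stand-in that buffers appended data and closes its file on exit; it is not avro's DataFileWriter, and io.BytesIO is assumed in place of the Python 2 StringIO class so the snippet also runs on Python 3. The point is only the ordering: flush, then copy the bytes out while the buffer is still open.

```python
import io


class BufferingWriter(object):
    """Stand-in writer that buffers output and closes its file on exit."""

    def __init__(self, fileobj):
        self.fileobj = fileobj
        self.pending = []

    def append(self, payload):
        # Data is held back until flush() is called.
        self.pending.append(payload)

    def flush(self):
        self.fileobj.write(b"".join(self.pending))
        self.pending = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.flush()
        self.fileobj.close()


def write_data(payload):
    data = io.BytesIO()
    with BufferingWriter(data) as writer:
        writer.append(payload)
        writer.flush()
        # Copy the bytes out before the with block exits and closes the buffer.
        return data.getvalue()


body = write_data(b"hello")
```

Returning the bytes rather than the buffer object avoids handing a closed object back to the caller.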