Question

I wrote a script that reads an XML file and outputs the relevant data to a TSV file. I'm converting it to write an XLSX file with openpyxl instead. Whenever I save the workbook at the end of the script, it hangs for 30+ seconds. I assume this is due to the large amount of data I'm writing (10144 rows, out to column 'BG'). Is there any way to make the save faster, or to write directly to the file as I generate the data so it doesn't all have to be saved at the end?

Solution

It's hard to say exactly what the problem is, but the first thing to try is openpyxl's optimized writer (the optimized_write flag, renamed write_only in later openpyxl releases):

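from openpyxl import Workbook

# write_only=True serializes each row as it is appended instead of keeping
# every cell in memory (older openpyxl releases spelled it optimized_write=True)
wb = Workbook(write_only=True)
ws = wb.create_sheet()

# now we'll fill it with 10k rows x 200 columns
for irow in range(10000):
    ws.append(['%d' % i for i in range(200)])

wb.save('new_big_file.xlsx')  # don't forget to save!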

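Also, consider switching to xlsxwriter in constant_memory mode, which flushes each row to disk as soon as the next one is started, so rows must be written in order (see docs).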
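
For illustration, here is a minimal sketch of what the xlsxwriter route might look like. The helper names write_rows_constant_memory and sample_rows are made up for this example, and sample_rows only stands in for whatever rows your XML parsing produces. The xlsxwriter calls themselves (Workbook with the constant_memory option, add_worksheet, write_row, close) are the library's regular API.

```python
import xlsxwriter


def write_rows_constant_memory(path, rows):
    """Write an iterable of row sequences to an XLSX file, one row at a time.

    Hypothetical helper for illustration; returns the number of rows written.
    """
    # With constant_memory, xlsxwriter flushes each row as soon as the next
    # row is started, so rows have to be written in increasing order.
    workbook = xlsxwriter.Workbook(path, {'constant_memory': True})
    worksheet = workbook.add_worksheet()
    count = 0
    for row_index, row in enumerate(rows):
        worksheet.write_row(row_index, 0, row)  # write the row starting at column A
        count += 1
    workbook.close()  # the final .xlsx file is assembled here
    return count


def sample_rows(n_rows, n_cols):
    """Assumed stand-in for the rows generated while parsing the XML file."""
    for _ in range(n_rows):
        yield ['%d' % i for i in range(n_cols)]
```

Calling write_rows_constant_memory('new_big_file.xlsx', sample_rows(10144, 59)) would produce a sheet of the size described in the question (10144 rows, out to column BG). Because the rows come from a generator, each one can be written as soon as it is parsed rather than collected first.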

Hope that helps.

Licensed under: CC-BY-SA with attribution