Python .xlsx (Office OpenXML) reader as simple as csv module?
Question
I know some Python xlsx readers are emerging, but from what I've seen they don't seem nearly as intuitive as the built-in csv module.
What I want is a module that can do something like this:
reader = xlsx.reader(open('/path/to/file'))
for sheet in reader:
    print 'In %s we have the following employees:' % (sheet.name)
    for row in sheet:
        print '%s, %s years old' % (row['Employee'], row['Age'])
Is there such a reader?
Solution
xlrd has xlsx handling for basic data extraction, using the same APIs as for xls, in alpha test at the moment. Send me private e-mail if interested.
OTHER TIPS
Well, maybe not for the xlsx format, but certainly for xls. Grab xlrd, which is available from PyPI.
Here's some example code to get a feel for how easy it is to work with:
import xlrd

# Zero-based column indexes of the fields of interest
EMPLOYEE_CELL = 5
AGE_CELL = 6

reader = xlrd.open_workbook('C:\\path\\to\\excel_file.xls')
for sheet in reader.sheets():
    print('In %s we have the following employees:' % sheet.name)
    # sheet.row(r) returns the list of cells in row r
    for r in range(sheet.nrows):
        row_cells = sheet.row(r)
        print('%s, %s years old' % (row_cells[EMPLOYEE_CELL].value, row_cells[AGE_CELL].value))
If you can save the documents as xls, you should be good. I haven't tested the code above, but it should be pretty close, if not 100% correct. Try it out and let me know.
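The loop above addresses cells by fixed column index, while the question asks for rows keyed by column name, the way csv.DictReader works. Here is a minimal sketch of that idea, assuming the first row of each sheet is a header row. `dict_rows` is a hypothetical helper, not part of xlrd, and the sample rows are stand-in data so the snippet runs without a workbook.

```python
def dict_rows(rows):
    """Yield dicts keyed by the first row's values, like csv.DictReader.

    rows is any iterable of row value lists; the first one is assumed
    to be the header row.
    """
    rows = iter(rows)
    try:
        header = next(rows)
    except StopIteration:
        return  # no rows at all, so nothing to yield
    for values in rows:
        yield dict(zip(header, values))


# Stand-in data; with xlrd the rows could instead come from
# (sheet.row_values(r) for r in range(sheet.nrows)).
sample = [
    ['Employee', 'Age'],
    ['Alice', 30.0],
    ['Bob', 45.0],
]
for row in dict_rows(sample):
    print('%s, %s years old' % (row['Employee'], row['Age']))
```

Note that xlrd returns numeric cells as floats, so an age may need `int(...)` before printing.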
EDIT:
I'm guessing you're trying to do this on a non-Windows machine. You may be able to use something like PyODConverter to convert the document from xlsx to xls, and then run against the converted file. Something like this:
user@server:~# python DocumentConverter.py excel_file.xlsx excel_file.xls
user@server:~# python script_with_code_above.py
Once again, I haven't tested it, but hopefully it'll work for your needs.