Question

I am trying to read the attached xlsx file using Python's openpyxl, but the workbook cannot be loaded. Here is my attempt to open the file:

>>> from openpyxl import load_workbook
>>> workbook = load_workbook(filename = "test.xlsx")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\openpyxl\reader\excel.py", line 136, in load_workbook
    _load_workbook(wb, archive, filename, use_iterators, keep_vba)
  File "C:\Python27\lib\site-packages\openpyxl\reader\excel.py", line 198, in _load_workbook
    keep_vba=keep_vba)
  File "C:\Python27\lib\site-packages\openpyxl\reader\worksheet.py", line 332, in read_worksheet
    fast_parse(ws, xml_source, string_table, style_table, color_index)
  File "C:\Python27\lib\site-packages\openpyxl\reader\worksheet.py", line 320, in fast_parse
    parser.parse()
  File "C:\Python27\lib\site-packages\openpyxl\reader\worksheet.py", line 137, in parse
    dispatcher[tag_name](element)
  File "C:\Python27\lib\site-packages\openpyxl\reader\worksheet.py", line 176, in parse_merge
    self.ws.merge_cells(mergeCell.get('ref'))
  File "C:\Python27\lib\site-packages\openpyxl\worksheet.py", line 815, in merge_cells
    raise InsufficientCoordinatesException(msg)
openpyxl.shared.exc.InsufficientCoordinatesException: Range must be a cell range (e.g. A1:E1)

Solution 3

The problem was that some merged cells were, in fact, merged with themselves. openpyxl expected a merged cell reference always to be a range of cells. A fix for the problem which ignores meaningless merges has been added to the 2.0 branch.
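To check whether a given workbook has this kind of merge before loading it, the worksheet XML inside the .xlsx archive can be inspected directly. The following is only a sketch using the standard library: the helper names find_single_cell_merges and scan_workbook are made up for illustration, and it assumes the problematic references appear either as a single cell (no colon) or as a range whose two ends are the same cell.

```python
import re
import zipfile

# Matches <mergeCell ref="..."/> (optionally namespace-prefixed),
# but not the enclosing <mergeCells count="..."> element.
MERGE_REF = re.compile(r'<(?:\w+:)?mergeCell\b[^>]*\bref="([^"]+)"')


def find_single_cell_merges(xml_text):
    """Return merge refs that are not a real range, e.g. 'C3' or 'D4:D4'."""
    suspicious = []
    for ref in MERGE_REF.findall(xml_text):
        start, sep, end = ref.partition(":")
        if not sep or start == end:
            suspicious.append(ref)
    return suspicious


def scan_workbook(path):
    """Map each worksheet part of an .xlsx archive to its suspicious merge refs."""
    report = {}
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            if name.startswith("xl/worksheets/") and name.endswith(".xml"):
                xml_text = archive.read(name).decode("utf-8")
                suspicious = find_single_cell_merges(xml_text)
                if suspicious:
                    report[name] = suspicious
    return report
```

Calling scan_workbook("test.xlsx") would return an empty dict when no such merges are found. This only reports the references; it does not modify the file.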

OTHER TIPS

It appears that your .xlsx file is damaged or corrupted. There are many possible causes. One is that a file of a different type was simply renamed to .xlsx, which does not make it a valid workbook. To confirm this behaviour, try opening the file in Microsoft Excel.

I tried reading the file with openpyxl, xlrd and pandas, but none of them worked.

>>> import xlrd
>>> xlrd.open_workbook('test.xlsx')
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found '<html> <'


>>> from openpyxl import load_workbook
>>> workbook = load_workbook(filename = "test.xlsx")
InvalidFileException: File is not a zip file

>>> import pandas 
>>> pandas.ExcelFile('test.xlsx')
InvalidFileException: File is not a zip file
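The errors above ("found '<html> <'" and "File is not a zip file") suggest that the file I tested was an HTML page rather than a real workbook. A rough way to check what a file actually contains, regardless of its extension, is to look at its contents. This is a minimal sketch: sniff_format is a hypothetical helper name, and the labels it returns are arbitrary.

```python
import zipfile


def sniff_format(path):
    """Guess a file's real format from its contents, ignoring the extension."""
    # A genuine .xlsx file is a zip archive.
    if zipfile.is_zipfile(path):
        return "zip"
    with open(path, "rb") as handle:
        head = handle.read(512)
    # An HTML page saved under an .xlsx name (e.g. a download error page).
    if head.lstrip().lower().startswith((b"<html", b"<!doctype html")):
        return "html"
    # Legacy binary .xls files start with the OLE2 compound-file signature.
    if head.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        return "ole2"
    return "unknown"
```

A result of "zip" only means the container is plausible; the workbook inside can still fail to load for other reasons.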

I ran into this issue while trying to open every file in a directory matching *.xlsx. I later found that the file causing the error was named ~$filename.xlsx. I'm guessing that Microsoft Excel indicates that a file is currently open by creating a file with the same name, prefixed with ~$. Once I closed the file, everything worked as expected.
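If closing the workbook is not an option, the ~$ files can be skipped when collecting paths. A small sketch, assuming the only files to exclude are those whose names start with ~$ (real_workbooks is a made-up helper name):

```python
import os


def real_workbooks(directory):
    """Return sorted .xlsx paths in `directory`, skipping '~$' temporary files."""
    paths = []
    for name in sorted(os.listdir(directory)):
        if name.lower().endswith(".xlsx") and not name.startswith("~$"):
            paths.append(os.path.join(directory, name))
    return paths
```

Each returned path can then be passed to load_workbook as in the question.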

I like openpyxl and use it for creating xlsx documents. This could be a bug, or an Excel feature that openpyxl does not yet support, triggered by something in your specific document. I would report it to the openpyxl community.

OK, guys. I have reported this bug to the openpyxl developers, and they have provided a quick fix for it. Here is the complete thread.

I have never tried openpyxl, but I use xlrd for reading Excel files (.xls and .xlsx). It works great.

See the examples and documentation at http://www.python-excel.org/

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow