Question

Using Python, I need to be able to do the following operations to a workbook for excel 2007:

1.delete colums

I am looking into xlrd; however.

Can anyone please tell me how could do this?

Was it helpful?

Solution

If you're working in Python 3.x, you'll find a lot of trouble using the xlrd/xlwt/xlutils family, as they're modules for Python 2.

You might consider openpyxl for working with Excel 2007 .xlsx files in Python 3.

If you just need to shift values over (disregarding formatting, etc) here's one way to do it. You can build on this:

from openpyxl import load_workbook
from openpyxl.cell import column_index_from_string as col_index
from openpyxl.cell import get_column_letter as col_letter

def del_col(s, col, cmax=None, rmax=None):
    col_num = col_index(col) - 1
    cols = s.columns
    if isinstance(cmax, str):
        cmax = col_index(cmax)
    cmax = cmax or s.get_highest_column()
    rmax = rmax or s.get_highest_row()
    for c in range(col_num, cmax + 1):
        # print("working on column %i" % c)
        for r in range(0, rmax):
            cols[c][r].value = cols[c+1][r].value
    for r in range(0, rmax):
        cols[c+1][r].value = ''

    return s

if __name__ == '__main__':
    wb = load_workbook('input_book.xlsx')
    ws = wb.active
    # or by name: ws = wb['SheetName']
    col = 'D'
    del_col(ws, col)
    wb.save('output_book.xlsx')

OTHER TIPS

This deletes a column from an xlsx worksheet using openpyxl 2.3.3. You can specify either a column number or letter:

import openpyxl.cell

def delete_column(ws, delete_column):
    if isinstance(delete_column, str):
        delete_column = openpyxl.cell.column_index_from_string(delete_column)
    assert delete_column >= 1, "Column numbers must be 1 or greater"

    for column in range(delete_column, ws.max_column + 1):
        for row in range(1, ws.max_row + 1):
            ws.cell(row=row, column=column).value = \
                    ws.cell(row=row, column=column+1).value

On its final iteration this copies None values from column ws.max_column+1 to column ws.max_column, wiping out the values that used to be there. While the values in the cells are correct, unfortunately ws.max_column does not decrement.

In other answers there is talk of using Worksheet.garbage_collect() to reset ws.max_column, but I could only find the private method Worksheet._garbage_collect(), so I did not use it.

UPDATE: In the end I found that deleting a lot of columns was inefficient. Hiding them is a superior solution. This preserves formatting and even more importantly, maintains the integrity of formulae that reference the hidden cells:

def hide_column(ws, column_id):
    if isinstance(column_id, int):
        assert column_id >= 1, "Column numbers must be 1 or greater"
        column_id = openpyxl.cell.get_column_letter(column_id)
    column_dimension = ws.column_dimensions[column_id]
    column_dimension.hidden = True
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top