Question

I am basically trying to copy some specific columns from a CSV file and paste those in an existing excel file[*.xlsx] using python. Say for example, you have a CSV file like this :

 col_1   col_2   col_3  col_4
  1        2       3     4
  5        6       7     8
  9       10      11    12 

So, i wanted to copy the both col_3 and col_4 and paste those in col_8 and col_9 in an existing excel file [which is a .XLSX format]. I have tried this in various way to solve, but could not find out the exact way. i tried something like this :

with open( read_x_csv, 'rb') as f:
    reader = csv.reader(f)
    for row in reader: 
            list1 = row[13] 
            queue1.append(list1)
            list2 = row[14] 
            queue2.append(list2)
            list3 = row[15] 
            queue3.append(list3)
            list4 = row[16] 
            queue4.append(list4)

and then

 rb = open_workbook("Exact file path.....")
 wb = copy(rb)
 ws = wb.get_sheet(0) 

 row_no = 0

 for item in queue1:
    if(item != ""):
            ii = int(item)
            ws.write(row_no,12,ii) 
            row_no = row_no + 1
            #ws.write(item)
            print item
    else:

            ws.write(row_no,12,item) 
            row_no = row_no + 1

  wb.save("Output.xls") 

but problem with this solution is it does not allow me to save as *.XLSX format which is strictly required for me.

I have tried to use Openpyxl as it can handle *.XLSX format, but could not find out a way to modify the existing excel file. can anyone please help on this?

Doubt : 1) Can we really read a whole column from a CSV file and store into an array/list using python? 2) Can we modify the existing excel file which is in .XLSX format using openpyxl or any other package?

No correct solution

OTHER TIPS

You can try the following implementation

from openpyxl import load_workbook
import csv
def update_xlsx(src, dest):
    #Open an xlsx for reading
    wb = load_workbook(filename = dest)
    #Get the current Active Sheet
    ws = wb.get_active_sheet()
    #You can also select a particular sheet
    #based on sheet name
    #ws = wb.get_sheet_by_name("Sheet1")
    #Open the csv file
    with open(src) as fin:
        #read the csv
        reader = csv.reader(fin)
        #enumerate the rows, so that you can
        #get the row index for the xlsx
        for index,row in enumerate(reader):
            #Assuming space separated,
            #Split the row to cells (column)
            row = row[0].split()
            #Access the particular cell and assign
            #the value from the csv row
            ws.cell(row=index,column=7).value = row[2]
            ws.cell(row=index,column=8).value = row[3]
    #save the csb file
    wb.save(dest)
  • Can we really read a whole column from a CSV file and store into an array/list using python? No, because files are read sequentially, csv reader cannot read a column of data to a row. Instead you may read the whole content and use izip and islice to get a particular column. You can also use numpy.array

  • Can we modify the existing excel file which is in .XLSX format using openpyxl or any other package? Yes, see the example above

As it is 2021, get_sheet_by_name is deprecated and raises an DeprecationWarning with the following message: Call to deprecated function get_sheet_by_name (Use wb[sheetname]).

The following snippet can be used in order to not raise the warning.

from openpyxl import load_workbook

file_path = 'test.xlsx'

wb = load_workbook(file_path)

ws = wb['SHEET_NAME']  # or wb.active

ws['G6'] = 123

wb.save(file_path)


from openpyxl import load_workbook
# Class to manage excel data with openpyxl.

class Copy_excel:
    def __init__(self,src):
        self.wb = load_workbook(src)
        self.ws = self.wb.get_sheet_by_name("Sheet1")
        self.dest="destination.xlsx"

    # Write the value in the cell defined by row_dest+column_dest         
    def write_workbook(self,row_dest,column_dest,value):
        c = self.ws.cell(row = row_dest, column = column_dest)
        c.value = value

    # Save excel file
    def save_excel(self) :  
        self.wb.save(self.dest)

Open an existing excel file (Using load_workbook(...))

As simple as that!

from openpyxl import load_workbook
wb = load_workbook('test.xlsx')

See docs: https://openpyxl.readthedocs.io/en/stable/tutorial.html#loading-from-a-file

Append data at the end (keeping the old data)

work_sheet = wb.active # Get active sheet
work_sheet.append(['John', 'Customer', 'He likes football'])

Save modified workbook in test.xlsx

wb.save('test.xlsx')
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top