Question

I am trying to write a Python script that will read data from an Excel document and then write it to a single table in an Oracle database. I am currently using xlrd to read from the Excel doc and cx_Oracle to insert the data into the database.

I had previously done this using a nested for-loop: for every row, go through each column in the Excel doc, store each column value in a variable, and insert the values into the table. However, this was rather inefficient for a few thousand rows of data, so I am looking to do it with cx_Oracle's executemany() method instead.

I am currently using this code to load the data into a list of lists and then calling the executemany() command:

# sh1 is the xlrd sheet and cursor is the cx_Oracle cursor, both created earlier
rows = []
for rownum in range(sh1.nrows):
    column_value = sh1.row_values(rownum)
    EMPLOYEE = column_value[1]
    ITEM_DATE = column_value[2]
    HOURS = column_value[3]
    row = [EMPLOYEE, ITEM_DATE, HOURS]
    rows.append(row)

# one positional bind placeholder per column
query = """INSERT INTO TABLE1 (EMPLOYEE, ITEM_DATE, HOURS) VALUES (:1, :2, :3)"""
# executemany by passing the list rows, one [EMPLOYEE, ITEM_DATE, HOURS] list per insert
cursor.executemany(query, rows)

The rows list looks like this:

[[u'Employee 1', 10000.0, 8.0], [u'Employee 1', 10001.0, 8.0], [u'Employee 1', 10002.0, 8.0]....]
# I have disguised the names and numbers here

However, I am getting a type error when the executemany() statement is being executed:

cursor.executemany(query, rows)
TypeError: expecting string, unicode or buffer object

The query executes perfectly fine when performing a cursor.execute(query, row) (doing a single insert on the last row of data read), so I presume there is something wrong with the way the list of parameters is formatted, not with the query string. However, my parameters appear to be correctly formatted according to this tutorial. Can anybody help me understand why my code is not working?

Update: I tried manually putting some data into the rows variable to see if my formatting was off, and removed the u prefix so the names are plain strings rather than unicode:

rows = [['Employee 1', 10000.0, 8.0], ['Employee 1', 10001.0, 8.0], ['Employee 1', 10002.0, 8.0]]

Inserting these three entries worked just fine, so I am currently investigating the unicode character as the cause of the problem. Any help would still be appreciated.
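
To check which columns actually hold unicode values, the types in rows can be tallied per column. This is only a minimal sketch: column_types is a hypothetical helper name, not part of xlrd or cx_Oracle, and the sample data mirrors the disguised rows above.

```python
def column_types(rows):
    """Return, for each column position, the set of Python type names seen."""
    seen = {}
    for row in rows:
        for index, value in enumerate(row):
            seen.setdefault(index, set()).add(type(value).__name__)
    return seen


# Sample data shaped like the rows list in the question
rows = [[u'Employee 1', 10000.0, 8.0], [u'Employee 1', 10001.0, 8.0]]
print(column_types(rows))
```

On Python 2 the first column should report unicode, while the other two report float.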

Solution

It looks like unicode was the entire problem. I used print type(EMPLOYEE) in the for-loop and confirmed that the employee name was a unicode object, whereas the insert was apparently expecting a plain ASCII string. I converted the value using the str() function and everything worked correctly. The only downside is that str() will raise a UnicodeEncodeError if the employee name actually contains non-ASCII characters, so I will soon be looking into encoding the string properly so that those characters are handled.
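
One possible way around the str() limitation is to encode unicode values explicitly instead of relying on the implicit ASCII conversion. The following is a sketch rather than a tested fix for this exact setup: it assumes Python 2 and that UTF-8 bytes are acceptable to the Oracle client and database character set, which I have not verified. to_bind_value and prepare_rows are hypothetical helper names.

```python
import sys

PY2 = sys.version_info[0] == 2
TEXT_TYPE = type(u"")  # unicode on Python 2, str on Python 3


def to_bind_value(value, encoding="utf-8"):
    """Encode unicode text to a byte string on Python 2; pass everything else through."""
    # Assumption: the database side accepts text in the given encoding.
    if PY2 and isinstance(value, TEXT_TYPE):
        return value.encode(encoding)
    return value


def prepare_rows(raw_rows, encoding="utf-8"):
    """Build a list of tuples suitable for passing to cursor.executemany()."""
    return [tuple(to_bind_value(v, encoding) for v in row) for row in raw_rows]


# Sample data; the second name contains a non-ASCII character that str() would reject
rows = prepare_rows([[u"Employee 1", 10000.0, 8.0], [u"Zo\xeb", 10001.0, 8.0]])
# cursor.executemany(query, rows)  # cursor and query as defined in the question
```

On Python 3 the helper leaves values unchanged, since text there is already a single str type.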

Licensed under: CC-BY-SA with attribution