UnicodeEncodeError: 'ascii' codec can't encode characters in position 10-11: ordinal not in range(128)

StackOverflow https://stackoverflow.com/questions/21687615

Question

I am trying to write a script that searches through an .xls file and prints the rows for which certain conditions are true. I have no problem with that part. I do, however, have a problem when I need to write such a row to a file.

This is the code:

import xlrd    
import string
dataFile = open('Napadaci.txt', 'w')
workbook = xlrd.open_workbook('TBS_58_pos10_stars75_2014-02-09.xls')
worksheet = workbook.sheet_by_name('Sheet1')
num_rows = worksheet.nrows - 1
num_cells = worksheet.ncols - 1
curr_row = -1
b = 0
new_cell_value = ""
while curr_row < num_rows:
  curr_row += 1
  row = worksheet.row(curr_row)
  curr_cell = 4
  cell_value = worksheet.cell_value(curr_row, curr_cell)
  if cell_value < 17.0:
    curr_cell = 5
    cell_value = worksheet.cell_value(curr_row, curr_cell)
    if cell_value == 95.0:
        curr_cell = 9
        cell_value = worksheet.cell_value(curr_row, curr_cell)
        if cell_value == "Tehnical" or cell_value == "Quick" or cell_value == "Head" or cell_value == "Unpredictable":  
            b += 1
            dataFile.write(str(b)+'\n')
            curr_cell = -1
            while (curr_cell + 1) < num_cells:
                curr_cell += 1
                cell_value = worksheet.cell_value(curr_row, curr_cell)
                new_cell_value=cell_value
                if isinstance(cell_value, str):
                    new_cell_value = cell_value.encode('ascii','ignore')
                dataFile.write(str(new_cell_value)+'\n')
        dataFile.write(str('Trazim sljedeceg')+'\n'+'\n'+'\n'+'\n')

So, a bunch of ifs to make sure the row is exactly right. But when I try to run it, I get this error: UnicodeEncodeError: 'ascii' codec can't encode characters in position 10-11: ordinal not in range(128). I googled it and found out that this happens because the .xls file contains characters such as š and ć. I am going through all the cells one by one, and I figured that I only need to handle this in cells that contain strings, hence the very last if. I am quite sure that the

    new_cell_value = cell_value.encode('ascii','ignore')

line should fix it, but it does not. Please help, I don't know what I am doing wrong. In case you need more information: I have Python 2.7.3 and I am running Ubuntu 12.04.

Edit: Oh, and those characters aren't very important to me, so I can afford to lose them if needed.


Solution

The issue here is the if statement.

>>> uni = u"\u04533testing"
>>> print uni
ѓ3testing
>>> isinstance(uni, str)
False
>>> type(uni)
<type 'unicode'>

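This object is not a str; it is a unicode string. Therefore the isinstance check fails and the encode call is never reached. The value is then passed unchanged to str(), which in Python 2 implicitly encodes it as ASCII and raises the UnicodeEncodeError. You want: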

if isinstance(cell_value,unicode):
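
To make the fix concrete, here is a minimal sketch of a helper that could replace the last if in the inner loop. The name to_ascii, the text_type alias, and the sample row values are hypothetical and only for illustration. The try/except is there so the same snippet also runs on Python 3, where the unicode name no longer exists.

```python
# Hypothetical helper, not part of the original script.
try:
    text_type = unicode  # Python 2: text cells from xlrd are unicode objects
except NameError:
    text_type = str      # Python 3: str is already the text type


def to_ascii(value):
    """Return value as ASCII-only text, dropping characters that cannot be encoded."""
    if isinstance(value, text_type):
        # encode() drops the non-ASCII characters; decode() turns the
        # resulting bytes back into text so both Python versions behave alike.
        return value.encode('ascii', 'ignore').decode('ascii')
    # Numbers and other non-text cells are simply converted with str().
    return str(value)


# Made-up cell values standing in for one worksheet row.
row = [u"Luka Modri\u0107", 16.0, 95.0, u"Quick"]
lines = [to_ascii(cell) for cell in row]
```

In the script, the inner loop would then only need dataFile.write(to_ascii(cell_value)+'\n'). If the characters should be kept rather than dropped, one alternative is to open the output file with io.open('Napadaci.txt', 'w', encoding='utf-8') and write unicode strings to it.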
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow