pyodbc + MS Access(*.mdb) + UnicodeDecodeError

https://stackoverflow.com/questions/21158460

28-09-2022

Question

I connect to an MS Access database (.mdb file) via pyodbc.

Some of the data in this database contains Polish characters (ł, ó, ź, ć, ś, ę and so on). When I fetch the data, the Polish characters are replaced by strange characters (³, ê). I have tried decoding with utf8, cp1250, cp1252, latin1 and latin2, but that doesn't solve my problem (the characters are still incorrect).

Can anyone help me?

P.S. For now my workaround is data = data.replace('\xc2\xb3', 'ł'), but it is ugly as hell.

Solution

I have an .mdb file with some sample data in a table named [vocabulary]. When I launch Access and open the table in Datasheet View it looks like this:

ID  word      language  english_equiv
--  --------  --------  -------------
 5  żaglówka  Polish    sailboat

The following Python 2.7.5 code

# -*- coding: utf-8 -*-
import pyodbc

db = pyodbc.connect(
    r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};' +
    r'DBQ=C:\__tmp\unicodeMdbTest.mdb')

cursor1 = db.execute('SELECT [word] FROM [vocabulary] WHERE [ID]=5')

while 1:
    row = cursor1.fetchone()
    if not row:
        break
    print row.word
db.close()

successfully prints the following in the IDLE shell

żaglówka

Note the file encoding declaration on the first line of the .py file.
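
The symptoms in the question look like a code page mix-up: in cp1250 the letters ł and ę are the bytes 0xB3 and 0xEA, and those same bytes read as latin1 are ³ and ê. The sketch below reproduces that and reverses it. It is written for Python 3, unlike the Python 2.7 code above. The helper names garble and repair are hypothetical, and the cp1250/latin1 pairing is an assumption about that particular database, not something the question confirms.

```python
# -*- coding: utf-8 -*-
# Hypothetical helpers. Assumption: the text was stored as cp1250
# but its bytes were interpreted as latin-1 somewhere along the way.


def garble(text, real="cp1250", assumed="latin-1"):
    """Simulate reading bytes of the `real` code page as `assumed`."""
    return text.encode(real).decode(assumed)


def repair(text, real="cp1250", assumed="latin-1"):
    """Undo garble(): recover the original bytes, then decode them as `real`."""
    return text.encode(assumed).decode(real)


if __name__ == "__main__":
    print(garble("ł"), garble("ę"))      # prints: ³ ê
    print(repair("³"), repair("ê"))      # prints: ł ę
    # The garbled ł, encoded as UTF-8, gives the bytes used in the
    # question's replace() workaround.
    print(garble("ł").encode("utf-8"))   # prints: b'\xc2\xb3'
```

If the wrong decoding was cp1252 rather than latin1, a few bytes map differently (for example 0x9C, which is ś in cp1250), so the assumed argument would have to change accordingly.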

Licensed under: CC-BY-SA with attribution
