Saving micro sign character in a mongo collection [closed]
27-06-2021 - |
Question
I'm working on a Python script that creates a mongo collection from a MySQL db. The problem is with the micro sign character:
bson.errors.InvalidStringData: strings in documents must be valid UTF-8: '\xb5g'
I tried encoding and decoding the value with several codecs (utf-8, latin-1, cp1252, iso-8859-2), but none of them worked; I always get the following error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb5 in position 0: ordinal not in range(128)
This is the code that gets the data from the MySQL db (the USDA database):
# -*- encoding: utf-8 -*-
import MySQLdb
mysqldb = MySQLdb.connect(DBCONF)
cursor = mysqldb.cursor()
foodid = 1001
q = (
' SELECT nut.Nutr_Val,'
' nutdef.Units,'
' nutdef.NutrDesc, nutdef.Tagname'
' FROM food_des AS f'
' JOIN nutrient AS nut ON nut.NDB_No = f.NDB_No'
' JOIN nutrient_def AS nutdef ON nutdef.Nutr_No = nut.Nutr_No'
' WHERE f.NDB_No = %s'
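)
# Pass foodid as a query parameter so the driver does the quoting,
# and call execute on the local cursor created above (there is no self here).
cursor.execute(q, (foodid,))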
The field with the micro sign character is the nutdef.Units one.
Solution
The byte 0xb5 is the micro sign in latin-1 but is not valid UTF-8 on its own, so try decoding the value as latin-1 to get a unicode string:
a = '\xb5g'
# '\xb5g'
print a
# ?g
b = a.decode('latin-1')
print b
# µg
b
# u'\xb5g'
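To apply the same idea to whole rows before they go into mongo, something along these lines might work. It is only a sketch: `is_valid`, `to_text` and `row_to_document` are hypothetical helper names, the sample row is made up, and it assumes every byte string coming back from MySQL is latin-1.

```python
# -*- coding: utf-8 -*-
raw = b'\xb5g'  # the bytes from the error message in the question


def is_valid(data, encoding):
    """Return True if the byte string decodes cleanly with this codec."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


def to_text(value, encoding='latin-1'):
    """Decode byte strings to unicode text; leave other values untouched.

    Assumption: all byte strings in the row are latin-1 encoded.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def row_to_document(row, fields):
    """Pair hypothetical field names with decoded row values."""
    return dict((name, to_text(value)) for name, value in zip(fields, row))


# Made-up sample row in the column order of the SELECT in the question.
fields = ('value', 'units', 'description', 'tagname')
row = (684.0, raw, b'Vitamin A, RAE', b'VITA_RAE')
doc = row_to_document(row, fields)
```

Here `is_valid(raw, 'ascii')` and `is_valid(raw, 'utf-8')` are both false, which is why the two errors in the question appear, while `doc['units']` is the unicode string `u'\xb5g'`, which bson can encode as UTF-8.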
Or you can fix this in your MySQL connection by telling it to use unicode on all CHAR, VARCHAR, and TEXT fields:
MySQLdb.connect(..., use_unicode=True)
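Spelled out a little more, the connection setup could look roughly like the sketch below. `connection_kwargs` and `connect` are hypothetical helpers, it assumes your connection settings are kept in a dict of keyword arguments, and `charset='utf8'` is an assumption too; pick whatever charset you want the server to use when sending text to the client.

```python
def connection_kwargs(base_conf, charset='utf8'):
    """Return a copy of the settings with unicode handling switched on.

    Assumption: base_conf is a dict of MySQLdb.connect keyword arguments.
    """
    conf = dict(base_conf)
    conf['charset'] = charset    # character set used for the connection
    conf['use_unicode'] = True   # return text columns as unicode objects
    return conf


def connect(base_conf):
    """Open a connection whose text columns come back as unicode."""
    import MySQLdb  # imported here so the helper above works without it
    return MySQLdb.connect(**connection_kwargs(base_conf))
```

With this in place the Units value should already arrive as `u'\xb5g'`, so no manual decoding is needed before inserting into mongo.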
Licensed under: CC-BY-SA with attribution