It's failing on line 13 of the vCard because the ADR property is incorrectly marked as being encoded in "quoted-printable". In a correctly encoded value, the ü character would be written as =FC (in Latin-1; =C3=BC in UTF-8) rather than appearing raw, which is why vobject is throwing the error.
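To illustrate what a correctly quoted-printable-encoded ü looks like, here is a minimal sketch using the standard-library quopri module. It is written for Python 3, and the sample value "Zürich" is a made-up placeholder, not taken from the vCard in question.

```python
import quopri

# Hypothetical sample value; the real ADR line is not reproduced here.
text = "Zürich"

# Quoted-printable works on bytes, so the result depends on the charset.
latin1_qp = quopri.encodestring(text.encode("latin-1"))
utf8_qp = quopri.encodestring(text.encode("utf-8"))

print(latin1_qp)  # b'Z=FCrich'
print(utf8_qp)    # b'Z=C3=BCrich'

# Decoding reverses the process, given the same charset.
assert quopri.decodestring(latin1_qp).decode("latin-1") == text
```

A value that is labelled quoted-printable but still contains a raw ü is therefore not valid for either charset.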
python, vobject, encoding, vcards
Question
I am using vobject in Python. I am attempting to parse the vCard located here:
http://www.mayerbrown.com/people/vCard.aspx?Attorney=1150
To do this, I do the following:
import urllib
import vobject
vcard = urllib.urlopen("http://www.mayerbrown.com/people/vCard.aspx?Attorney=1150").read()
vcard_object = vobject.readOne(vcard)
Whenever I do this, I get the following error:
Traceback (most recent call last):
  File "<pyshell#86>", line 1, in <module>
    vobject.readOne(urllib.urlopen("http://www.mayerbrown.com/people/vCard.aspx?Attorney=1150").read())
  File "C:\Python27\lib\site-packages\vobject-0.8.1c-py2.7.egg\vobject\base.py", line 1078, in readOne
    ignoreUnreadable, allowQP).next()
  File "C:\Python27\lib\site-packages\vobject-0.8.1c-py2.7.egg\vobject\base.py", line 1031, in readComponents
    vline = textLineToContentLine(line, n)
  File "C:\Python27\lib\site-packages\vobject-0.8.1c-py2.7.egg\vobject\base.py", line 888, in textLineToContentLine
    return ContentLine(*parseLine(text, n), **{'encoded':True, 'lineNumber' : n})
  File "C:\Python27\lib\site-packages\vobject-0.8.1c-py2.7.egg\vobject\base.py", line 262, in __init__
    self.value = str(self.value).decode('quoted-printable')
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 29: ordinal not in range(128)
I have tried a number of other variations on this, such as converting vcard into unicode, using various encodings, etc., but I always get the same, or a very similar, error message.
Any ideas on how to fix this?
Other answers
The file is downloaded as a UTF-8 (I think) encoded string, but the library tries to interpret it as ASCII.
Try adding the following line after urlopen:
vcard = vcard.decode('utf-8')
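Since the answer is unsure which encoding the server uses, a more defensive variant is to try UTF-8 first and fall back to Latin-1. The sketch below is written for Python 3 (where the downloaded data is bytes); the helper name decode_vcard_bytes and the sample data are hypothetical, and the list of candidate encodings is an assumption.

```python
def decode_vcard_bytes(raw, encodings=("utf-8", "latin-1")):
    """Return text decoded with the first candidate encoding that succeeds."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next candidate
    raise ValueError("none of the candidate encodings matched")


# Hypothetical sample data standing in for the downloaded vCard bytes.
sample_utf8 = "ADR:Zürich".encode("utf-8")
sample_latin1 = "ADR:Zürich".encode("latin-1")

print(decode_vcard_bytes(sample_utf8))
print(decode_vcard_bytes(sample_latin1))
```

Latin-1 accepts any byte sequence, so it only makes sense as the last candidate.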
The vobject library's readOne method is pretty awkward.
To avoid problems, I decided to persist the vCards in my database as quoted-printable data, which readOne accepts.
Assuming some_vcard is a UTF-8 encoded string:
import quopri
quopried_vcard = quopri.encodestring(some_vcard)
The quopried_vcard is then persisted, and when needed it is read back with:
vobj = vobject.readOne(quopried_vcard)
Then, to get the decoded data back, e.g. for the fn field of the vCard:
quopri.decodestring(vobj.fn.value)
Maybe somebody can handle UTF-8 with readOne better. If so, I would love to see it.
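The store-and-restore round trip described in this answer can be sketched without vobject, using only quopri. This is a Python 3 illustration under stated assumptions: the one-line some_vcard value is a made-up placeholder, and the parsing step with readOne is left out.

```python
import quopri

# Hypothetical one-line stand-in for a full vCard.
some_vcard = "FN:Jürgen Müller"

# Encode to quoted-printable before persisting; the result is pure ASCII bytes.
quopried_vcard = quopri.encodestring(some_vcard.encode("utf-8"))
quopried_vcard.decode("ascii")  # raises if any non-ASCII byte remained

# After loading from storage, decode the quoted-printable bytes back to text.
restored = quopri.decodestring(quopried_vcard).decode("utf-8")
print(restored == some_vcard)  # True
```

The stored form contains only ASCII, which is what sidesteps the 'ascii' codec error from the question.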