vobject: reading VCF files containing German umlauts

https://stackoverflow.com/questions/18718712

28-06-2022

Question

I use the vobject module.
I want to read a VCF file that contains names with German umlauts, encoded as UTF-8:

BEGIN:VCARD    
VERSION:2.1    
FN:Some Name    
N:Name;Some;;;    
ADR;WORK;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:;;=49=6D=20=4D=C3=BC=68=6C=65=6E=62=72=75=63=68=20=32=33;=4B=C3=B6=6E=69=67=73=77=69=6E=74=65=72;=4E=52=57;=35=35=35=35=35;    
END:VCARD

The code:

 fp = open("vcf/%s.vcf" %(name), "r")          
 content = fp.read()          
 fp.close()    

 v = vobject.readOne(content)    
 v.prettyPrint()

For example:
König is read as K?nig
Mühle is read as M?hle
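
The question marks are not in the file itself. As a quick check, the ADR value from the vCard above can be decoded with the standard library's quopri module and then interpreted as UTF-8. This is only a sketch written for Python 3; the helper name decode_qp_field is made up for illustration and is not part of vobject.

```python
import quopri

# ADR value copied from the vCard in the question (the part after "ADR;...:").
adr_value = (
    ";;=49=6D=20=4D=C3=BC=68=6C=65=6E=62=72=75=63=68=20=32=33;"
    "=4B=C3=B6=6E=69=67=73=77=69=6E=74=65=72;=4E=52=57;=35=35=35=35=35;"
)


def decode_qp_field(field):
    """Decode one quoted-printable field and interpret the bytes as UTF-8.

    Hypothetical helper for illustration only.
    """
    raw = quopri.decodestring(field.encode("ascii"))
    return raw.decode("utf-8")


# Split on the component separator first, then decode each component,
# so that an encoded semicolon (=3B) would not be mistaken for a separator.
parts = [decode_qp_field(p) for p in adr_value.split(";")]

print(parts[3] == u"K\u00f6nigswinter")  # True
```

The decoded components contain the umlauts intact, which suggests the data is fine and the loss happens when Python handles the strings.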

The only solution that comes to mind is:
- read the file
- look for the UTF-8 byte sequences of the umlauts
- replace those sequences
- parse the VCF content
- put the UTF-8 umlaut sequences back afterwards

But there must be a more elegant way. Could anyone point me in the right direction?

Regards,
Ck

Solution

I found the solution. The problem was that Python needed to interpret the string data as UTF-8.
With the built-in function (Python 2)

unicode("ÄÖÜ", "utf-8")

the umlauts are printed as expected.
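
The unicode built-in exists only in Python 2. As a minimal sketch of the same idea that also runs on Python 3, the file can be opened with an explicit encoding so that its content is decoded as UTF-8 while reading. The helper name read_vcf_text and the sample file are hypothetical. Passing the result to vobject.readOne as in the question is an assumption that is not exercised here.

```python
import io
import os
import tempfile


def read_vcf_text(path):
    """Read a vCard file and return its content decoded as UTF-8 text.

    Hypothetical helper for illustration only.
    """
    # io.open accepts an encoding argument on both Python 2 and Python 3.
    with io.open(path, "r", encoding="utf-8") as fp:
        return fp.read()


# Hypothetical sample file, written as raw UTF-8 bytes for the demonstration.
sample = u"BEGIN:VCARD\nVERSION:2.1\nFN:K\u00f6nig M\u00fchle\nEND:VCARD\n"
fd, sample_path = tempfile.mkstemp(suffix=".vcf")
with os.fdopen(fd, "wb") as out:
    out.write(sample.encode("utf-8"))

content = read_vcf_text(sample_path)
os.remove(sample_path)

# The umlauts survive the round trip because the bytes were decoded as UTF-8.
print(u"K\u00f6nig" in content)  # True
```

The decoded text would then take the place of content in the question's code before it is handed to the parser.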

Licensed under: CC-BY-SA with attribution
