How should I use the proper charset for my vcard?

https://stackoverflow.com/questions/20939674

24-09-2022

Question

I had an issue with my smartphone, so I needed to export my address book, then check and modify each phone number. Some of my contacts contain Spanish characters (á ñ ü), and when I open my new home-made vcf file, Thunderbird doesn't recognize them. I've read some related questions, but I can't work out which part applies to my case.

This is the workflow:

I take a vcf from the phone.
I open it in Thunderbird and fix the wrong information.
I export it to a csv file.
I run my code to convert the csv back to a vCard (vcf).

So:

In the old vCard, some fields look like this:

NOTE;CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:=6E=6F1=61=74=61=63=69=
In the csv file, all the characters and info are right.
In the new vCard:
1. Opened in Notepad: it shows the charset tag and a legible string.
2. Opened in Thunderbird's address book: the fields with the charset issue are not shown.

This is my code, which opens and parses the info:
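
For reference, a QUOTED-PRINTABLE value like the NOTE field above is the UTF-8 bytes of the text with every non-ASCII byte escaped as `=XX`. The following is a minimal Python 3 sketch (the script in this question targets Python 2) using the standard `quopri` module. The sample value `Peña` is an illustrative assumption, not taken from the address book.

```python
import quopri

# Hypothetical sample value; any text with accented characters works.
text = "Peña"

# Quoted-printable keeps ASCII as-is and escapes other bytes as =XX.
encoded = quopri.encodestring(text.encode("utf-8"))

# Decoding reverses both steps: unescape, then read the bytes as UTF-8.
decoded = quopri.decodestring(encoded).decode("utf-8")

print(encoded)
print(decoded)
```

If a field is tagged ENCODING=QUOTED-PRINTABLE, its value has to be escaped this way; writing the raw characters after that tag gives the reader a value it cannot decode consistently.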

def hasRareChar(string):
    '''
    Checks whether the string contains any accented character.
    '''
    for ch in string:
        if ch in 'ñáéíóúÁÉÍÓÚäëïöüÄËÏÖÜ':
            return True
    return False

def codeTag(string):
    '''
    Replaces the trailing ':' of a tag with the charset and encoding parameters.
    '''
    return string[:-1] + ';CHARSET=UTF-8;ENCODING=QUOTED-PRINTABLE:'


def parseCsvVcard(cab, linea):
    '''
    sets string info into dict structure,
    csv to vcard: I need to re-order and re-name the fields.
    '''
    # dict splitting info
    d = {}
    for x, y in zip(cab, linea.split(',')):
        # print x + ':' + y
        d[x] = y
    # ------------------------------------------------
    # dict for VCARD format.
    d2 = {}
    # NAME COMPOSITION - using hasRareChar(str) codeTag(str)
    '''
    check = ['First Name' in d.keys(),'Last Name' in d.keys(),'Display Name' in d.keys(),_
             hasRareChar(d['First Name']),hasRareChar(d['Last Name']),hasRareChar(d['Display Name'])]
    tags = ['','','','N:','N:','FN:']
    for index, i in enumerate(check[3:]):
        if i: tags[index+3] = codetag(tags[index+3])
    tags = ['','','',if check[3]: '',,]
    '''
    # First and Last Names --------
    codeNames = hasRareChar(d['First Name'] + d['Last Name'])
    strNames = d['Last Name'] + ';' + d['First Name'] + ';;;'
    if not codeNames:
        d2['N:'] = strNames
    else: d2[codeTag('N:')] = strNames

    # DISPLAY NAME ----------------
    if d['Display Name'] != '' and not hasRareChar(d['Display Name']):
        d2['FN:'] = d['Display Name']
    elif d['Display Name'] != '':
        d2[codeTag('FN:')] = d['Display Name']
    else:
        # No display name: build FN from the first and last names.
        fullName = d['First Name'] + ' ' + d['Last Name']
        if not codeNames:
            d2['FN:'] = fullName
        else: d2[codeTag('FN:')] = fullName
    # -------IF TOWER:-----------------------------------------
    for i in d: # for the remaining non-empty fields
        if i not in ['Display Name', 'First Name', 'Last Name'] and d[i] != '':
            if 'Primary Email' == i : # detect which field this is
                tag = 'EMAIL;HOME:'
            if 'Secondary Email' == i:
                tag = 'EMAIL;WORK:'
            if 'Mobile Number' == i:
                tag = 'TEL;CELL:'
            if 'Home Phone' == i:
                tag = 'TEL;HOME:'
            if 'Work Phone' == i:
                tag = 'TEL;WORK:'
            if 'Web Page 1' == i:
                tag = 'URL:'
            if 'Notes' in i:
                tag = 'NOTE:'
            if hasRareChar(d[i]): # check whether the TAG must carry the charset
                tag = codeTag(tag)
                d2[tag] = d[i].decode() # WHAT SHOULD COME HERE ???????????
            else: d2[tag] = d[i] # assign
    return d2
# ----------- MAIN CODE DOWN HERE ---------------------
#  -- csv file opened to a string variable ------------
csvFile = open("contactList.CSV",'r')
readed = csvFile.read()
csvFile.close()
lines = readed.split('\n') # split lines
# separated header and info rows.
head = lines[0].split(',')
# the data rows
lines = lines[1:]
# ----------------------------------------
# new text construction with parse function.
texto = ''
for x in lines[:-1]: # last is a blank record
    y = parseCsvVcard(head,x)
    #print y
    texto += 'BEGIN:VCARD\nVERSION:2.1\n'
    # write out each field in turn
    for index in y:
        texto += str(index)+str(y[index])+'\n'
    texto += 'END:VCARD\n'
# ----------------------------------------
# WRITE TO NEW VCARD FILE
with open("please RENAME.vcf", 'w') as vcard:
    vcard.write(texto)

print '----- File Created: please RENAME.vcf -----'
print '----- Check it for proper information.'

It seems that I keep the charset tag reference and Python reads the correct characters (Python gets most things right :), but I don't apply any transformation to the string variables. Note the question inside the code: I have read some other posts, and maybe that is where the problem lies.

Solution

Well, I managed it:

I apply this to every string that I need to be displayed correctly:

strNames.decode('ISO-8859-1').encode('utf-8')

and this is the parameter that is now added to the tag in the vCard (the codeTag function in the code above has changed accordingly):

;CHARSET=UTF-8:
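
In Python 3 terms, the `.decode('ISO-8859-1').encode('utf-8')` step turns bytes read in one encoding into UTF-8 bytes. Here is a minimal sketch, assuming the csv exported by Thunderbird is ISO-8859-1 encoded (which the fix above implies). The helper name `vcard_line` and the sample contact are hypothetical.

```python
def vcard_line(name, value):
    """Build one vCard 2.1 property line, tagging non-ASCII values as UTF-8."""
    if value.isascii():
        return f"{name}:{value}"
    return f"{name};CHARSET=UTF-8:{value}"

# Bytes as they would appear in an ISO-8859-1 encoded csv file (assumed sample).
raw = b"Mu\xf1oz;Jos\xe9;;;"

# Same effect as the Python 2 expression raw.decode('ISO-8859-1').encode('utf-8').
text = raw.decode("iso-8859-1")
utf8_bytes = text.encode("utf-8")

line = vcard_line("N", text)
print(line)
print(utf8_bytes)
```

Only values that actually contain non-ASCII characters get the CHARSET parameter, which mirrors what hasRareChar and codeTag do in the question.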

Now Notepad shows characters like these:

Ã± Ã©

but Thunderbird shows everything properly.

And it works on my Android phone too!
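
The `Ã±` and `Ã©` pairs are what UTF-8 bytes look like when a viewer reads them as a single-byte encoding, so they suggest the file is in fact UTF-8. A short Python 3 sketch, assuming Notepad fell back to Windows-1252 (cp1252); the function name `as_seen_in_cp1252` is made up for illustration.

```python
def as_seen_in_cp1252(text):
    """Show how UTF-8 encoded text appears when misread as cp1252."""
    return text.encode("utf-8").decode("cp1252")

for ch in "ñé":
    # Each accented letter becomes two bytes in UTF-8, hence two characters.
    print(ch, ch.encode("utf-8"), as_seen_in_cp1252(ch))
```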

The code above only needs the charset .decode().encode() fixes and the IF TOWER modification to work. Also change the file names to your own. I posted the code that worked for me here:

csvToVCARD.py
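
Since the linked script is not reproduced here, the following is an independent Python 3 sketch of the same idea, not the author's file. It assumes Thunderbird's column names as used in the question, replaces the chain of ifs with a dictionary lookup, and lets the `csv` module handle quoted commas. The names `FIELD_TAGS`, `prop`, `row_to_vcard` and `csv_to_vcards`, as well as the sample contact, are hypothetical.

```python
import csv
import io

# Thunderbird csv column -> vCard 2.1 property (replaces the chain of ifs).
FIELD_TAGS = {
    "Primary Email": "EMAIL;HOME",
    "Secondary Email": "EMAIL;WORK",
    "Mobile Number": "TEL;CELL",
    "Home Phone": "TEL;HOME",
    "Work Phone": "TEL;WORK",
    "Web Page 1": "URL",
    "Notes": "NOTE",
}

def prop(name, value):
    """Return one property line, adding CHARSET=UTF-8 for non-ASCII values."""
    charset = "" if value.isascii() else ";CHARSET=UTF-8"
    return f"{name}{charset}:{value}"

def row_to_vcard(row):
    """Convert one csv row (a dict) into the text of a single vCard."""
    first = row.get("First Name") or ""
    last = row.get("Last Name") or ""
    # Fall back to "First Last" when the display name is empty.
    display = row.get("Display Name") or f"{first} {last}".strip()
    lines = ["BEGIN:VCARD", "VERSION:2.1",
             prop("N", f"{last};{first};;;"),
             prop("FN", display)]
    for column, tag in FIELD_TAGS.items():
        value = row.get(column) or ""
        if value:  # skip empty and unknown fields
            lines.append(prop(tag, value))
    lines.append("END:VCARD")
    return "\n".join(lines) + "\n"

def csv_to_vcards(csv_file):
    """Read csv rows from an open text file and return all vCards as text."""
    return "".join(row_to_vcard(row) for row in csv.DictReader(csv_file))

# In-memory sample standing in for the exported csv file (assumed data).
sample = io.StringIO(
    "First Name,Last Name,Display Name,Primary Email,Mobile Number,Notes\n"
    'José,Muñoz,,jose@example.com,600123456,"llamar, mañana"\n'
)
vcf_text = csv_to_vcards(sample)
print(vcf_text)
```

With real files, the input could be opened with `open("contactList.CSV", encoding="iso-8859-1", newline="")` and the result written with `open("please RENAME.vcf", "w", encoding="utf-8")`, so that the bytes on disk match the CHARSET=UTF-8 parameter. The input encoding is an assumption and should be checked against the actual export.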

Licensed under: CC-BY-SA with attribution