Question

I have a dictionary that was previously written out to a text file, i.e. [in]:

dict_from_file = """foo\tfoobar, bar, foo foo\tFoo bar language, computing\nprog\t python, java, c, c++\t computing\nedibles\tcereal, milka, whatever\tfood"""

Originally, column 1 was the key, column 2 held the aliases, and column 3 was the value. I need to convert the text file so that the last column becomes the key and the 1st and 2nd columns become the values:

[out]:

Foo bar language\tfoobar, foo, bar, foo foo
computing\tfoobar, foo, bar, foo foo, python, java, c, c++, prog
food\tcereal, milka, whatever, edibles

The purpose of doing this is so that, given the input foo foo, my getkeybyvalue() function will return ['Foo bar language','computing'].

I've been doing it as below:

from collections import defaultdict

outdict = defaultdict(list)
def getkeybyvalue(dictionary, value):
    return [i for i,j in dictionary.items() if value in j]

dict_from_file = """foo\tfoobar, bar, foo foo\tFoo bar language, computing\nprog\t python, java, c, c++\t computing\nedibles\tcereal, milka, whatever\tfood"""

for line in dict_from_file.split('\n'):
    column1, column2, column3 = line.strip().split('\t')
    #print column1, column2, column3
    for c3 in column3.split(','):
        c3 = c3.strip(', ')
        outdict[c3].append(column1)
        for c2 in column2.split(','):
            outdict[c3].append(c2.strip(' ,'))

for k in outdict:
    print(k, outdict[k])

print(getkeybyvalue(outdict, 'foo foo'))

  1. Is there a less verbose way to do this?
  2. How else should I read the text file so that, given foo foo, my Python dictionary returns ['Foo bar language','computing']?

Solution

Since you asked for an example: you could use list comprehensions to make the code technically more compact. I don't think they add much here, although they may be more efficient on huge data sets; you would have to profile both versions to confirm that.

They do make the code harder to read, so I'm really not a fan of that approach here for most situations.

If your solution feels cobbled together and you control the code that writes the file, store the data in a standard format (JSON, for example) rather than a hand-rolled tab-separated layout.

from collections import defaultdict

outdict = defaultdict(list)
def getkeybyvalue(dictionary, value):
    return [i for i,j in dictionary.items() if value in j]

dict_from_file = """foo\tfoobar, bar, foo foo\tFoo bar language, computing\nprog\t python, java, c, c++\t computing\nedibles\tcereal, milka, whatever\tfood"""

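# Split every line into its three tab-separated columns.
columns = [line.strip().split('\t') for line in dict_from_file.split('\n')]
for c1, c2, c3 in columns:
    # Each column-3 entry becomes a key holding column 1 plus the column-2 aliases.
    for c3_item in c3.split(','):
        outdict[c3_item.strip(', ')] += [c1] + [c.strip(' ,') for c in c2.split(',')]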

for k in outdict:
    print(k, outdict[k])

print(getkeybyvalue(outdict, 'foo foo'))
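
As for storing the data in a standard format, here is a minimal sketch that assumes JSON is acceptable for your use case. The invert() helper and the inverted.json file name are made up for illustration, and the file is written to a temporary directory only so the example is self-contained. The idea is to parse the tab-separated text once, save the inverted mapping, and from then on load it directly with no string splitting.

```python
import json
import os
import tempfile
from collections import defaultdict

# Same sample data as in the question: key <TAB> aliases <TAB> categories.
dict_from_file = "foo\tfoobar, bar, foo foo\tFoo bar language, computing\nprog\t python, java, c, c++\t computing\nedibles\tcereal, milka, whatever\tfood"

def invert(text):
    """Map each column-3 entry to the column-1 key plus its column-2 aliases.

    Hypothetical helper, not part of the original code.
    """
    outdict = defaultdict(list)
    for line in text.split('\n'):
        c1, c2, c3 = line.strip().split('\t')
        aliases = [c.strip() for c in c2.split(',')]
        for category in c3.split(','):
            outdict[category.strip()].extend([c1] + aliases)
    return dict(outdict)

def getkeybyvalue(dictionary, value):
    return [k for k, v in dictionary.items() if value in v]

# Hypothetical file name, placed in a temporary directory for this demo.
path = os.path.join(tempfile.mkdtemp(), 'inverted.json')

# One-time conversion: write the inverted mapping as JSON.
with open(path, 'w') as f:
    json.dump(invert(dict_from_file), f, indent=2)

# Every later run: load the mapping directly.
with open(path) as f:
    loaded = json.load(f)

print(getkeybyvalue(loaded, 'foo foo'))  # ['Foo bar language', 'computing']
```

JSON is only one option; the point is that a standard format lets the json module (or an equivalent) handle the parsing, so the column-splitting logic runs once at conversion time instead of on every read.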
Licensed under: CC-BY-SA with attribution