Question

I have two text files:

file1.txt:

a,1
b,3
c,5
d,-4

and file2.txt:

sample1
a,12 
b,10
c,4
d,6

sample2
a,5 
b,8
c,6
d,12

sample3
a,3 
b,6
c,9
d,10

What I want to do is subtract the value for each letter in file1.txt from the corresponding letter in every sample in file2.txt, and create one output file per sample, so the output looks like:

First file for sample1, sample1.txt

sample1.txt
a,11 # 12-1 as 1 from file1.txt was subtracted from 12 in file2.txt
b,7 # 10-3
c,-1 # 4-5
d,10 # 6-(-4)

and then a separate file for sample2, sample2.txt:

sample2.txt
a,4 # 5-1 as 1 from file1.txt was subtracted from 5 in file2.txt
b,5 # 8-3
c,1 # 6-5
d,16 # 12-(-4)

and the same for sample3.

I tried looping over file2.txt, but my original file2.txt has over 1000 samples and it takes a long time. Is there a quicker, more Pythonic way to do this?

Cheers, Kate


Solution

Interesting! Let's take a look.

The design is pretty simple. Read the file into a dictionary and perform manipulation on the dict, then write out the files.

with open('file1.txt') as in_:
    mapping = {}
    for line in in_:
        # each line looks like "a,1"
        key, value = line.strip().split(',')
        mapping[key] = int(value)

mapping is now {"a":1, "b":3, "c":5, "d":-4}. Now let's read in file2.txt.

values = {}
with open('file2.txt') as in_:
    for _ in range(3):
        # This is ugly, but it's a quick hack. I'd improve it later.
        # It hard-codes 3 samples of 4 rows each and assumes there are
        # no blank lines between the samples.
        cur_dict = next(in_).strip()
        values[cur_dict] = {}
        for __ in range(4):
            key, value = next(in_).strip().split(',')
            values[cur_dict][key] = int(value)

Sheesh, that's probably the ugliest code I've ever written, but values is now {"sample1": {"a":12, "b":10, "c":4, "d":6}, "sample2": ...}.
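
The hard-coded counts can be avoided. Here is a minimal sketch, assuming that any non-blank line without a comma is a sample name and that blank lines only separate samples. parse_samples is a hypothetical helper name, not part of the code above:

```python
def parse_samples(lines):
    """Group 'key,value' rows under the most recent sample-name line."""
    values = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines only separate samples
        if "," not in line:
            # assumption: a line without a comma starts a new sample
            current = line
            values[current] = {}
        else:
            key, value = line.split(",")
            values[current][key] = int(value)
    return values


# With the real file this would be used as:
#     with open("file2.txt") as in_:
#         values = parse_samples(in_)
demo = ["sample1", "a,12 ", "b,10", "", "sample2", "a,5 ", "b,8"]
print(parse_samples(demo))
```

This works for any number of samples and rows, since nothing depends on range(3) or range(4).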

Now for the manipulation. This is actually easy. Let's tack the file write onto it, since that step is rather elementary.

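for dataset in values:
    # subtract the file1.txt value from each letter of this sample
    for key, value in mapping.items():
        values[dataset][key] -= value
    # open for writing; the default mode is read-only
    with open(dataset + ".txt", "w") as out:
        out.write(dataset + "\n")  # header line; drop it if only the data rows are wanted
        for key, value in values[dataset].items():
            out.write("{},{}\n".format(key, value))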
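
With over 1000 samples it may also help to avoid building the whole values dict first. Below is a rough sketch of a single pass over file2.txt that writes each row as soon as it is read. It assumes the same layout as above (a line without a comma is a sample name, blank lines are separators) and writes only the data rows, without a header line. write_adjusted and its out_dir parameter are hypothetical names introduced for illustration:

```python
import os


def write_adjusted(file1_path, file2_path, out_dir="."):
    """Write one <sample>.txt per sample, with file1 values subtracted."""
    with open(file1_path) as in_:
        rows = (line.strip().split(",") for line in in_ if line.strip())
        mapping = {key: int(value) for key, value in rows}

    out = None
    written = []
    try:
        with open(file2_path) as in_:
            for raw in in_:
                line = raw.strip()
                if not line:
                    continue  # assumption: blank lines only separate samples
                if "," not in line:
                    # a new sample starts: close the previous output file
                    if out is not None:
                        out.close()
                    path = os.path.join(out_dir, line + ".txt")
                    out = open(path, "w")
                    written.append(path)
                else:
                    key, value = line.split(",")
                    out.write("{},{}\n".format(key, int(value) - mapping[key]))
    finally:
        if out is not None:
            out.close()
    return written
```

Calling write_adjusted("file1.txt", "file2.txt") would create sample1.txt, sample2.txt and so on in the current directory. Only mapping and one open file are held at a time, so memory use does not grow with the number of samples.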
Licensed under: CC-BY-SA with attribution