Finding common IDs in different .txt files and appending additional corresponding lines

Question 1

You can use dicts instead of sets:

# Index each file by its ID column (the second tab-separated field).
dictA = dict()
with open("file1.txt", 'r') as fileA:
    for line1 in fileA:
        listA = line1.rstrip('\n').split('\t')
        dictA[listA[1]] = listA

dictB = dict()
with open("file2.txt", 'r') as fileB:
    for line2 in fileB:
        listB = line2.rstrip('\n').split('\t')
        dictB[listB[1]] = listB

# Write fields 1-4 of file1 for every ID that occurs in both files.
with open("results.txt", 'w') as output:
    for key in set(dictA).intersection(dictB):
        output.write('\t'.join(dictA[key][1:5]) + '\n')
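
If the goal is also to append the corresponding fields from the second file, as the title suggests, the same two dictionaries can be combined per ID. This is only a minimal sketch, assuming the ID is the second tab-separated field in both files; the helper names index_by_id and merge_common and the sample rows are hypothetical:

```python
def index_by_id(lines, key_col=1):
    """Map the ID in column key_col to the split fields of its line."""
    index = {}
    for line in lines:
        fields = line.rstrip('\n').split('\t')
        index[fields[key_col]] = fields
    return index


def merge_common(lines_a, lines_b, key_col=1):
    """Return one tab-joined row per ID found in both inputs.

    Each row holds the fields from lines_a followed by the fields
    from lines_b, minus the repeated ID column.
    """
    index_a = index_by_id(lines_a, key_col)
    index_b = index_by_id(lines_b, key_col)
    rows = []
    # Sort the shared keys so the output order is reproducible.
    for key in sorted(index_a.keys() & index_b.keys()):
        extra = [f for i, f in enumerate(index_b[key]) if i != key_col]
        rows.append('\t'.join(index_a[key] + extra))
    return rows


# Hypothetical sample data; real input would come from the opened files.
sample_a = ["a1\tEMT15298\tGO:0003674\n", "a2\tEMT00001\tGO:0000001\n"]
sample_b = ["b1\tEMT15298\tPF08268\n"]
merged = merge_common(sample_a, sample_b)
```

Passing the open file objects instead of the sample lists should work the same way, since both are iterated line by line.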

Question 2

Since your first text file contains all of the "fields" needed for the output, we can reduce the logic and the number of steps slightly.

First we open the two input files and read them into lists:

with open('file1.txt', 'r') as a, open('file2.txt', 'r') as b:
    # Keep fields 1-4 of file1 and everything after field 0 of file2.
    fileA = [line.rstrip('\n').split('\t')[1:5] for line in a]
    fileB = [line.rstrip('\n').split('\t')[1:] for line in b]

So now we have two lists, fileA and fileB. You'll notice the slice notation on both of them. The [1:5] slice keeps fields 1 through 4 of each line in file1.txt, which are all of the values you want in the output, so fileA is ready and only needs to be filtered against the second list. Both slices also drop the first item of each line, so the EMT... values sit at index 0 and can be used for the comparison.
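
To make the slicing concrete, here is a small sketch on a single line. The sample line is hypothetical: its first column is invented for illustration, and the remaining values are borrowed from the expected output.

```python
# Hypothetical input line; only the layout matters here.
sample = "x\tEMT15298\tGO:0003674\tmolecular_function\tPF08268\n"

fields = sample.rstrip('\n').split('\t')
kept = fields[1:5]  # indices 1, 2, 3 and 4; index 0 is dropped
```

After the slice, the ID is the first element of kept, which is why the later comparison uses line[0].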

Now we can check whether the ID of each line in fileA appears anywhere in fileB (the lines do not have to match in their entirety) and write the matches to the results file:

with open('results.txt','w') as o:
    for line in fileA:
        if any(line[0] in l for l in fileB):
            o.write('%s\n' % '\t'.join(line))

results.txt is once again tab-delimited with the corresponding matches:

EMT15298    GO:0003674  molecular_function  PF08268
EMT20601    GO:0005515  protein binding PF08268
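
For larger files the any(...) scan can get slow, because it walks through fileB once for every line of fileA. One possible variant, sketched under the assumption that both inputs are lists of field lists like fileA and fileB, collects every field of the second list into a set first so that each lookup is a single hash test. The function name filter_matches and the sample rows are hypothetical:

```python
def filter_matches(rows_a, rows_b):
    """Keep rows of rows_a whose first field occurs anywhere in rows_b."""
    # Flatten every field of rows_b into one set for fast membership tests.
    seen = {field for row in rows_b for field in row}
    return [row for row in rows_a if row[0] in seen]


# Hypothetical sample rows standing in for fileA and fileB.
rows_a = [["EMT15298", "GO:0003674"], ["EMT99999", "GO:0000000"]]
rows_b = [["EMT15298", "PF08268"]]
matches = filter_matches(rows_a, rows_b)
```

This should select the same lines as the any(...) test, since both check whether the ID equals any field of any line in the second file.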

Question 3

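If you just want to do a "join" operation, you can use the Unix join command and specify the second column as the join field. By default join splits fields on blanks and separates its output with spaces, so for a tab-delimited file you should also set the separator with -t (the $'\t' form works in bash):

join -t $'\t' -j 2 file1.txt file2.txt

Both files must be sorted on the join field, otherwise join will not work correctly. The sort command can do that first, for example sort -t $'\t' -k2,2 file1.txt.

In addition, to select the columns you want, you can pipe the result to cut. Note that join writes the join field first, followed by the remaining fields of file1.txt and then those of file2.txt, so adjust the field numbers to that layout:

join -t $'\t' -j 2 file1.txt file2.txt | cut -f2,3,4,5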