Okay, your code is kind of headed in the right direction, but you have a few things decidedly confused.
You need to separate what your script is doing into two logical steps: one, aggregating (counting) all of the clu fields; two, writing each line whose clu count is > 1. You tried to do both steps at the same time and, well, it didn't work. You can technically do it that way, but you have the syntax wrong. It's also terribly inefficient to repeatedly search through your file for stuff. Best to read it only once or twice.
So, let's separate the steps. First, count up your clu fields. The collections module has a Counter that you can use.
from collections import Counter

# First pass: count how often each clu (the first field on a line) occurs.
with open(infilename, 'r') as infile:
    c = Counter(line.split()[0] for line in infile)
c is now a Counter that you can use to look up the count of a given clu.
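To make the lookup concrete, here is a small sketch on an in-memory list instead of a file. The sample lines are made up; I'm assuming the same whitespace-separated "clu gen spec fam" layout as in your file.

```python
from collections import Counter

# Hypothetical sample lines, standing in for the contents of the input file.
lines = [
    "clu1 genA specA famA",
    "clu2 genB specB famB",
    "clu1 genC specC famC",
]

# Count the first field of each line, exactly as in the file-based version.
c = Counter(line.split()[0] for line in lines)

print(c["clu1"])  # 2
print(c["clu2"])  # 1
print(c["clu9"])  # 0: a Counter returns 0 for a missing key rather than raising KeyError
```

Because a missing key counts as zero, a test like c[clu] > 1 is safe even for a clu that was never counted.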
# Second pass: copy only the lines whose clu occurs more than once.
with open(infilename, 'r') as infile, open(outfilename, 'w') as outfile:
    for line in infile:
        clu, gen, spec, fam = line.split()
        if c[clu] > 1:
            outfile.write(line)
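If you want to try both passes end to end, something like the following should work. It is only a sketch: the function name keep_repeated_clu and the sample data are my own inventions, and skipping blank lines is an extra assumption on my part (the snippets above would raise an error on a blank line).

```python
import os
import tempfile
from collections import Counter


def keep_repeated_clu(infilename, outfilename):
    """Copy the lines whose first field occurs more than once in the input."""
    # Pass 1: count the first field of every non-blank line.
    with open(infilename, 'r') as infile:
        counts = Counter(line.split()[0] for line in infile if line.strip())

    # Pass 2: write only the lines whose first field was seen more than once.
    with open(infilename, 'r') as infile, open(outfilename, 'w') as outfile:
        for line in infile:
            if line.strip() and counts[line.split()[0]] > 1:
                outfile.write(line)


# Made-up sample data, purely for illustration.
sample = "clu1 a b c\nclu2 d e f\nclu1 g h i\n\nclu3 j k l\n"

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.txt")
    dst = os.path.join(tmp, "out.txt")
    with open(src, 'w') as f:
        f.write(sample)
    keep_repeated_clu(src, dst)
    with open(dst) as f:
        result = f.read()

print(result)
```

Only the two clu1 lines end up in the output, and the file is read exactly twice no matter how many distinct clu values it contains.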