Question

I have a tsv file like this with headers.

Header1 header2 header3 header4 header5
Aa      bb      dd      cc      aa      
Bb      bb      aa      cc      bb    
Cc      bb      cc      dd      aa    
Aa      bb      ee      cc      dd    
Aa      vv      ff      gg      ii

I have a dictionary like {'0': 'aa', '1': 'bb', '3': 'cc'}

I am supposed to parse this file and return the rows where the column at index 0 is aa, the column at index 1 is bb and the column at index 3 is cc. In other words, I need all the rows where the first column is aa, the second column is bb and the fourth column is cc. So I should be able to print the 1st and the 4th data rows from the tsv file, which are

Aa  bb  dd  cc  aa
Aa  bb  ee  cc  dd

My code snippet does not give the intersection of all these conditions; instead it gives every row where any single condition is satisfied. Please help me correct my script. The dictionary specified above is named indexdict.

data = csv.reader(open(tsvfile, 'rb'), delimiter="\t")
fields = data.next()
print "-------------------------Rows Filtered-------------------------"
for key, value in indexdict.items():
    for row in data:
        if row[key] == value:
            linecount = linecount + 1
            print row

Solution

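The built-in all function is what you need. Your loops iterate over the conditions first and test each one on its own, so no row is ever checked against every condition at once (and the csv reader is already exhausted after the first pass over it). Loop over the rows once and keep a row only if all the conditions hold for it: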

for row in data:
    if all(row[key] == value for key, value in indexdict.items()):
        print row
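
For reference, here is a minimal self-contained sketch of the same idea in Python 3 syntax. It rests on a few assumptions that are not confirmed by the question: the dictionary keys are strings exactly as written there (so they are converted with int() before indexing), the sample values are all lowercase (treating the capitals in the sample as typing artifacts), and the file is replaced by an in-memory string. The names SAMPLE_TSV and filter_rows are hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical in-memory stand-in for the TSV file from the question.
# Assumption: values are lowercase; the capitals in the question are typos.
SAMPLE_TSV = (
    "header1\theader2\theader3\theader4\theader5\n"
    "aa\tbb\tdd\tcc\taa\n"
    "bb\tbb\taa\tcc\tbb\n"
    "cc\tbb\tcc\tdd\taa\n"
    "aa\tbb\tee\tcc\tdd\n"
    "aa\tvv\tff\tgg\tii\n"
)


def filter_rows(lines, indexdict):
    """Return the rows for which every index -> value condition holds."""
    reader = csv.reader(lines, delimiter="\t")
    next(reader)  # skip the header row
    # Assumption: keys are strings such as '0', so convert them to ints once.
    conditions = [(int(key), value) for key, value in indexdict.items()]
    return [
        row
        for row in reader
        # The length check guards against rows shorter than the index.
        if all(len(row) > idx and row[idx] == value for idx, value in conditions)
    ]


indexdict = {"0": "aa", "1": "bb", "3": "cc"}
matches = filter_rows(io.StringIO(SAMPLE_TSV), indexdict)
for row in matches:
    print("\t".join(row))
```

If the keys in your real dictionary are already integers, the int() conversion is harmless; if the values in your file really are capitalized, you would need to normalize case on both sides before comparing.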
Licensed under: CC-BY-SA with attribution