from __future__ import with_statement  # if you need it
import csv

with open('file_with_header_columns', 'r') as hapinfile, \
     open('file_missing_header_columns', 'r') as hapoutfile, \
     open('filescombined.txt', 'w') as outfile:
    # Read the header rows into a list up front: a csv.reader can only be
    # iterated once, so it would be exhausted after the first data row.
    header_rows = list(csv.reader(hapinfile, delimiter='\t'))
    data_rows = csv.reader(hapoutfile, delimiter='\t')
    out_data = csv.writer(outfile, delimiter='\t')
    for data_row in data_rows:
        for header_row in header_rows:
            if header_row[0] == data_row[0]:
                out_data.writerow(data_row)
                break  # stop looking through headers
You seem to have a really unfortunate problem here: you need nested loops to find your data, so every data row may be compared against every header row. If you could sort both CSV files by the header field, you could match them in a single pass and be much more efficient. As it is, take advantage of the `csv` module and condense everything. You can also use `break`, which, while a bit odd in a `for` loop, will at least "short-circuit" you out of the search through the second file once you've found your header.
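
If the first column is all you are matching on, one way to drop the inner loop entirely is to collect the header keys in a set first and test membership against it. This is a different technique from sorting, and only a minimal sketch: it assumes the key is the first tab-separated field, and the function name `rows_with_known_keys` and the sample rows are made up for illustration.

```python
import csv

def rows_with_known_keys(header_lines, data_lines, delimiter='\t'):
    """Return the data rows whose first field also starts some header row."""
    # Collect the first field of every non-empty header row in a set,
    # so each lookup is a constant-time membership test.
    header_keys = {row[0]
                   for row in csv.reader(header_lines, delimiter=delimiter)
                   if row}
    # Keep a data row only if its first field is a known header key.
    return [row
            for row in csv.reader(data_lines, delimiter=delimiter)
            if row and row[0] in header_keys]

# Hypothetical sample data; open file objects can be passed the same way.
headers = ["rs1\tA\tG", "rs3\tC\tT"]
data = ["rs1\t0\t1", "rs2\t1\t1", "rs3\t0\t0"]
matched = rows_with_known_keys(headers, data)
```

The returned rows can then be handed to `csv.writer(...).writerows(matched)`. The trade-off is that all header keys are held in memory at once, which should be fine unless the header file is very large.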