Question

I have a problem in which I process documents from files using Python generators. The number of files I need to process is not known in advance. Each file contains records that consume a considerable amount of memory, so generators are used to process the records. Here is a summary of the code I am working on:

def process_all_records(files):
    for f in files:
        fd = open(f, 'r')
        recs = read_records(fd)
        recs_p = (process_records(r) for r in recs)
        write_records(recs_p)

My process_records function checks the content of each record and only returns the records that have a specific sender. My problem is the following: I want to have a count of the number of elements being returned by read_records. I have been keeping track of the number of records in the process_records function using a list:

def process_records(r):
    if r.sender('sender_of_interest'):
        records_list.append(1)
    else:
        records_list.append(0)
    ...

The problem with this approach is that records_list could grow without bounds depending on the input. I want to be able to consume the contents of records_list once it grows to a certain point and then restart the process. For example, after 20 records have been processed, I want to find out how many records are from 'sender_of_interest' and how many are from other sources, and then empty the list. Can I do this without using a lock?

Solution

You could make your generator a class with an attribute that contains a count of the number of records it has processed. Something like this:

class RecordProcessor(object):
    def __init__(self, recs):
        self.recs = recs
        self.processed_rec_count = 0

    def __iter__(self):  # makes instances directly iterable by write_records
        for r in self.recs:
            if r.sender('sender_of_interest'):
                self.processed_rec_count += 1
                # process record r...
                yield r  # processed record

def process_all_records(files):
    for f in files:
        with open(f, 'r') as fd:
            recs_p = RecordProcessor(read_records(fd))
            write_records(recs_p)
            print('records processed:', recs_p.processed_rec_count)
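If you also want the breakdown the question asks for (matched vs. other senders, reported and reset every 20 records), the same idea extends naturally. Here is a minimal sketch, assuming read_records, write_records, and the r.sender() test behave as in the question; the class name CountingRecordProcessor and the report_every parameter are illustrative:

import time  # not required here, shown for parity with the tips below

class CountingRecordProcessor(object):
    def __init__(self, recs, report_every=20):
        self.recs = recs
        self.report_every = report_every  # report and reset every N records
        self.seen = 0                     # records examined in the current window
        self.matched = 0                  # records from 'sender_of_interest'

    def __iter__(self):
        for r in self.recs:
            self.seen += 1
            if r.sender('sender_of_interest'):
                self.matched += 1
                # process record r...
                yield r
            if self.seen == self.report_every:
                print('%d of %d records were from the sender of interest'
                      % (self.matched, self.seen))
                # "empty the list": restart the window counters
                self.seen = 0
                self.matched = 0

def process_all_records(files):
    for f in files:
        with open(f, 'r') as fd:
            write_records(CountingRecordProcessor(read_records(fd)))

Because the counters live on the object and everything runs in a single thread of iteration, no lock is needed.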

Other tips

Here's the straightforward approach. Is there some reason why something this simple won't work for you?

seen = 0
matched = 0

def process_records(r):
    global seen, matched  # module-level counters
    seen = seen + 1
    if r.sender('sender_of_interest'):
        matched = matched + 1
        records_list.append(1)
    else:
        records_list.append(0)

    # someOtherTimeBasedCriteria is a placeholder; see below.
    if seen > 1000 or someOtherTimeBasedCriteria:
        print("%d of %d total records had the sender of interest" % (matched, seen))
        seen = 0
        matched = 0

If you have the ability to close your stream of messages and re-open it, you might want one more variable tracking the total number of records seen, so that if you close the stream and re-open it later, you can skip ahead to the last record you processed and pick up there.

In this code, "someOtherTimeBasedCriteria" might be a timestamp check: record the current time in milliseconds when you begin processing, and reset the seen/matched counters whenever the current time is more than 20,000 ms (20 seconds) past that mark.
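Here is a minimal sketch of that scheme using time.monotonic() (seconds rather than milliseconds, but equivalent), combined with the running total suggested above; the 1000-record and 20-second thresholds are the figures from this answer, and r.sender() is assumed to behave as in the question:

import time

seen = 0
matched = 0
total_seen = 0                   # running total, survives the window resets
window_start = time.monotonic()  # when the current counting window began

def process_records(r):
    global seen, matched, total_seen, window_start
    seen += 1
    total_seen += 1
    if r.sender('sender_of_interest'):
        matched += 1

    # Reset every 1000 records or every 20 seconds, whichever comes first.
    if seen > 1000 or time.monotonic() - window_start > 20.0:
        print("%d of %d total records had the sender of interest" % (matched, seen))
        seen = 0
        matched = 0
        window_start = time.monotonic()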

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow