Question

I've written a small Python script to forward approximately 11000 mails (stored as *.eml files) to my mail server. Each mail has a CSV file attached, containing data from daily energy measurements.

My approach works correctly, but it takes nearly 20 minutes to send the mails. Any ideas to speed this up?

#! /usr/bin/env python
import os, sys
import email, smtplib


storage_folder = "data/export"
storage_header = 0

mail_server = 'server-address'
mail_username = 'mail-user'
mail_password = 'mail-password'

mail_to = 'mail-address-to'
mail_from = 'mail-sender'


def load_local_mails(file_folder):
    #: scan desired folder for *.eml files
    email_body = []

    for dirname, dirnames, filenames in os.walk(file_folder):
        for filename in filenames:
            if filename.endswith('.eml'):
                # close each file right away instead of leaking thousands of open handles
                with open(os.path.join(dirname, filename)) as f:
                    email_body.append(f.read())
    return email_body


def send_mail(email_body):
    #: send mails
    smtp = smtplib.SMTP(mail_server, 587)
    smtp.starttls()
    smtp.login(mail_username, mail_password)

    for idx, item in enumerate(email_body):
        m = email.message_from_string(item)
        # replace_header() raises KeyError if the header is missing from the mail
        m.replace_header("From", mail_from)
        m.replace_header("To", mail_to)
        # the subject is kept as-is, so there is no need to rewrite it

        smtp.sendmail(mail_from, mail_to, m.as_string())
        print("Forwarding mail %d/%d: '%s' from '%s' to %s" % (idx+1, len(email_body), m['subject'], m['date'], mail_to))

    smtp.quit()


def main():
    #: expected call: <script> --start <folder>
    if len(sys.argv) > 2 and sys.argv[1] == '--start':
        send_mail(load_local_mails(sys.argv[2]))
    else:
        sys.exit("Don't know what to do? :(. Goodbye")


if __name__ == '__main__':
    main()

Update: I was able to manage a direct (SCP) upload to the specific folder in the qmail environment. I think this is the fastest solution! Anyway, thanks for the suggestions.


Solution

Well, you don't provide much data, but 20 minutes does not look unreasonable for that many emails over SMTP. With SMTP you have to exchange 5 messages with the server:

  1. HELO (or EHLO)
  2. MAIL FROM
  3. RCPT TO
  4. DATA
  5. QUIT

Assuming a 10 ms round trip to your destination, that gives 11k * 50 ms = 550 s of network latency overhead alone. That is roughly 9 minutes.
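The estimate can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the function name and its parameters are made up for illustration, and the 10 ms round trip is the assumption from above, not a measurement. The number of round trips is kept as a parameter because a session that is reused for all mails, as in the question's script, pays for HELO and QUIT only once.

```python
def latency_overhead_seconds(n_mails, round_trips_per_mail, rtt_ms):
    """Pure network waiting time, ignoring payload transfer and server processing."""
    total_ms = n_mails * round_trips_per_mail * rtt_ms
    return total_ms / 1000.0


overhead = latency_overhead_seconds(11000, 5, 10)
print("%.0f s, about %.1f minutes" % (overhead, overhead / 60.0))
```

Halving either the round-trip time or the number of exchanges per mail halves this part of the runtime, which is why latency rather than bandwidth is the first suspect.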

Then you need to actually open and read 11k files, which can also take a while, though probably not that long. But you did not specify the size of the data you send. It may account for the rest of the time, especially since you say you are sending CSV data files.

It will be hard to speed this up without changing the technology you use. I would suggest transmitting the data as binary in bigger chunks, preferably compressed. Compression should pay off for CSV files.
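As a rough illustration of the "bigger chunks plus compression" idea, the sketch below packs many CSV files into a single gzip-compressed tar archive in memory, using only the standard library. The file names and the measurement data are invented for the example, and `fake_csv` and `bundle` are hypothetical helpers; how the archive is then transferred (SCP, HTTP, a single mail) is left open.

```python
import io
import tarfile


def fake_csv(day):
    # Made-up measurement data, one row every 15 minutes, only for illustration.
    rows = ["timestamp,meter_id,kwh"]
    for minute in range(0, 1440, 15):
        rows.append("2013-01-%02d %02d:%02d,meter-1,%.2f"
                    % (day, minute // 60, minute % 60, 0.25 + (minute % 7) * 0.01))
    return "\n".join(rows).encode("utf-8")


def bundle(files):
    """Pack a dict of {name: bytes} into one gzip-compressed tar archive (bytes)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


files = {"day-%02d.csv" % d: fake_csv(d) for d in range(1, 29)}
archive = bundle(files)
raw_size = sum(len(data) for data in files.values())
print("raw: %d bytes, bundled and compressed: %d bytes" % (raw_size, len(archive)))
```

One archive means one transfer instead of thousands of small ones, so the per-message protocol overhead is paid only once. The actual compression ratio depends on the real data.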

OTHER TIPS

Since this is a network-bound task, you can speed it up by splitting the work into smaller batches and executing them in parallel.

You should easily get down to a couple of minutes with this approach.

There are plenty of libraries for this. I would personally go for gevent, but multiprocessing or threading will do if you want to stick with the standard library.
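A minimal sketch of the batching idea with standard-library threads could look like the following. The helper names (`split_batches`, `send_batch`, `send_parallel`, `connect_real`) are hypothetical, and the server address and credentials are the placeholders from the question. Each batch gets its own SMTP connection, because sharing one `smtplib.SMTP` object across threads is not safe. The worker count of 8 is an arbitrary assumption; many servers limit concurrent connections per client.

```python
import smtplib
from concurrent.futures import ThreadPoolExecutor


def split_batches(items, n_batches):
    """Split items into at most n_batches contiguous slices of similar size."""
    size = max(1, -(-len(items) // n_batches))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def send_batch(batch, connect, mail_from, mail_to):
    """Send one batch over its own connection and return the number of mails sent."""
    smtp = connect()
    try:
        for body in batch:
            smtp.sendmail(mail_from, mail_to, body)
    finally:
        smtp.quit()
    return len(batch)


def send_parallel(bodies, connect, mail_from, mail_to, workers=8):
    """Send all bodies using one thread and one connection per batch."""
    batches = split_batches(bodies, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(send_batch, batch, connect, mail_from, mail_to)
                   for batch in batches]
        return sum(future.result() for future in futures)


def connect_real():
    # Placeholder server and credentials, taken from the question.
    smtp = smtplib.SMTP('server-address', 587)
    smtp.starttls()
    smtp.login('mail-user', 'mail-password')
    return smtp
```

It would be called as `send_parallel(bodies, connect_real, 'mail-sender', 'mail-address-to')`, where `bodies` is a list of complete message strings. Passing the connection factory as an argument also makes it easy to try the code with a fake connection object before pointing it at a real server.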

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow