Question

I want to find out a list of "From" addresses in a Maildir folder. Using the following script, it illustrates the varying formats that are valid in From:

import mailbox

mbox = mailbox.Maildir("/home/paul/Maildir/.folder") 
for message in mbox:
    print message["from"]

"John Smith" <jsmith@domain.com>
Tony <tony@domain2.com>
brendang@domain.net

All I need is the email address, for any valid (or common) "From:" field format. This must have been solved a crazillion times before, so I was expecting a library. All I can find is various regexes.

Is there a standard approach?

Was it helpful?

Solution

email.utils.parseaddr is your friend:

>>> emails = """"John Smith" <jsmith@domain.com>
Tony <tony@domain2.com>
brendang@domain.net"""
>>> lines = emails.splitlines()
>>> from email.utils import parseaddr
>>> [parseaddr(email)[1] for email in lines]
['jsmith@domain.com', 'tony@domain2.com', 'brendang@domain.net']

So you should just be able to work with:

for message in mbox:
    print parseaddr(message['from'])

Then, I guess if you just want unique email addresses, then you can just use a set directly over mbox, eg:

mbox = mailbox.MailDir('/some/path')
uniq_emails = set(parseaddr(email['from'])[1] for email in mbox)
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top