Assuming you're asking what I think you're asking, here's what to do:
First, fetch the BODYSTRUCTURE. Assuming Gmail's IMAP server supports this, you'll get back something like this:
(("TEXT" "PLAIN" ("CHARSET" "UTF-8") NIL NIL "QUOTED-PRINTABLE" 56 1 NIL NIL NIL NIL)
("TEXT" "HTML" ("CHARSET" "UTF-8" "NAME" "") NIL NIL "BASE64" 12345 NIL
("attachment" ("FILENAME" "")) NIL NIL)
("IMAGE" "JPEG" ("NAME" "funny picture") NIL NIL "BASE64" 56789 NIL
("attachment" ("FILENAME" "image.jpg")) NIL NIL))
"MIXED" ("BOUNDARY" "----_=_NextPart_001_1234ABCD.56789EF0") NIL NIL NIL)
And then fetch the (BODY ENVELOPE) if the structure has one.
RFC 3501, section 7.4.2, explains how to deal with these.
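To give an idea of what "dealing with these" involves, here's a rough sketch that turns the parenthesized response into nested Python lists. The function name parse_imap_list is made up for illustration, and the sketch assumes the response contains only quoted strings, atoms and numbers (no IMAP literals), so treat it as a starting point rather than a full RFC 3501 parser.

```python
import re

# A token is a paren, a quoted string (group 1), or a bare atom (group 2).
_TOKEN = re.compile(r'\(|\)|"((?:[^"\\]|\\.)*)"|([^\s()"]+)')


def parse_imap_list(text):
    """Parse IMAP parenthesized data into nested Python lists.

    Quoted strings become str, NIL becomes None, digit-only atoms become int.
    Returns the list of top-level items. IMAP literals ({n} followed by raw
    bytes) are NOT handled; that is an assumption of this sketch.
    """
    stack = [[]]
    for match in _TOKEN.finditer(text):
        token = match.group(0)
        if token == "(":
            stack.append([])
        elif token == ")":
            if len(stack) == 1:
                raise ValueError("unexpected closing parenthesis")
            finished = stack.pop()
            stack[-1].append(finished)
        elif match.group(1) is not None:
            # Quoted string: undo backslash escapes such as \" and \\.
            stack[-1].append(re.sub(r"\\(.)", r"\1", match.group(1)))
        else:
            atom = match.group(2)
            if atom.upper() == "NIL":
                stack[-1].append(None)
            elif atom.isdigit():
                stack[-1].append(int(atom))
            else:
                stack[-1].append(atom)
    if len(stack) != 1:
        raise ValueError("missing closing parenthesis")
    return stack[0]


sample = (
    '(("TEXT" "PLAIN" ("CHARSET" "UTF-8") NIL NIL "QUOTED-PRINTABLE" 56 1 NIL NIL NIL NIL)'
    '("IMAGE" "JPEG" ("NAME" "funny picture") NIL NIL "BASE64" 56789 NIL '
    '("attachment" ("FILENAME" "image.jpg")) NIL NIL) '
    '"MIXED" ("BOUNDARY" "x") NIL NIL NIL)'
)
body = parse_imap_list(sample)[0]
```

After this, body[0] is the first part as a flat list, body[1] the second, and body[2] is the multipart subtype string.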
Once you've determined that (BODY[1]) and (BODY[2]) are the plain-text and HTML versions of the main content, and (BODY[3]) is the first real attachment, you can download the plain-text body by fetching (BODY[1]), and you already have the name of the attachment from the structure.
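Here's roughly how that determination could look once the structure is in nested-list form. This is a sketch, not tested against Gmail: it assumes the BODYSTRUCTURE has already been parsed into Python lists (str for quoted strings, None for NIL), the names list_parts, _param and _find_filename are invented for the example, and nested message/rfc822 parts are treated as opaque leaves.

```python
def _param(params, key):
    """Look up key in a flat [name, value, name, value, ...] list (or None)."""
    if not params:
        return None
    for name, value in zip(params[0::2], params[1::2]):
        if isinstance(name, str) and name.upper() == key:
            return value
    return None


def _find_filename(part):
    """Return the disposition FILENAME if present, else the NAME parameter."""
    for item in part[3:]:
        # A disposition looks like ["attachment", ["FILENAME", "x.jpg"]].
        looks_like_disposition = (
            isinstance(item, list)
            and len(item) == 2
            and isinstance(item[0], str)
            and isinstance(item[1], list)
        )
        if looks_like_disposition:
            filename = _param(item[1], "FILENAME")
            if filename:
                return filename
    return _param(part[2], "NAME")


def list_parts(structure, prefix=""):
    """Yield (part_number, mime_type, filename) for every leaf part."""
    if isinstance(structure[0], list):
        # Multipart: leading lists are child parts, numbered from 1.
        index = 0
        for child in structure:
            if not isinstance(child, list):
                break  # reached the multipart subtype, e.g. "MIXED"
            index += 1
            yield from list_parts(child, f"{prefix}{index}.")
        return
    mime_type = f"{structure[0]}/{structure[1]}".lower()
    # A non-multipart message is addressed as part 1.
    yield prefix.rstrip(".") or "1", mime_type, _find_filename(structure)


example = [
    ["TEXT", "PLAIN", ["CHARSET", "UTF-8"], None, None, "QUOTED-PRINTABLE", 56, 1,
     None, None, None, None],
    ["TEXT", "HTML", ["CHARSET", "UTF-8"], None, None, "BASE64", 12345, 200,
     None, None, None, None],
    ["IMAGE", "JPEG", ["NAME", "funny picture"], None, None, "BASE64", 56789,
     None, ["attachment", ["FILENAME", "image.jpg"]], None, None],
    "MIXED", ["BOUNDARY", "x"], None, None, None,
]
parts = list(list_parts(example))
```

The part numbers it yields ("1", "2", "3", or "1.2" for nested multiparts) are what goes inside BODY[...] when fetching.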
I don't think either imaplib or any of the stdlib MIME- and mail-related modules will do the hard part for you (interpreting the structure), but I haven't actually checked, so I'd look there first and, if that turns up nothing, go to PyPI to see if anyone else has already written the code.
Well, actually, first I'd just fetch BODYSTRUCTURE, (BODY ENVELOPE) and (BODY[3]) for a specific message to make sure Gmail has complete support before writing a whole mess of code…
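That sanity check could be as small as the sketch below. It uses imaplib's fetch, which returns a (status, data) pair and raises imaplib.IMAP4.error on a BAD response. The helper name probe_message is made up, and I've swapped BODY[3] for BODY.PEEK[3] so the probe doesn't mark the message as read; that substitution is mine, not something Gmail requires.

```python
import imaplib

# Fetch items to try. BODY.PEEK[3] stands in for BODY[3] so that probing
# does not set the Seen flag (a substitution made for this sketch).
PROBE_ITEMS = ("(BODYSTRUCTURE)", "(BODY ENVELOPE)", "(BODY.PEEK[3])")


def probe_message(conn, msg_num):
    """Try each fetch item on one message and record whether it worked.

    conn is assumed to be a logged-in imaplib connection with a mailbox
    already selected. Returns {item: (ok, data_or_error_text)}.
    """
    results = {}
    for item in PROBE_ITEMS:
        try:
            status, data = conn.fetch(str(msg_num), item)
        except imaplib.IMAP4.error as exc:
            # The server rejected the command outright (BAD response).
            results[item] = (False, str(exc))
            continue
        results[item] = (status == "OK", data)
    return results


# Usage (needs real credentials, so it is left commented out):
# conn = imaplib.IMAP4_SSL("imap.gmail.com")
# conn.login(user, password)
# conn.select("INBOX", readonly=True)
# for item, (ok, data) in probe_message(conn, 1).items():
#     print(item, "OK" if ok else "FAILED", data)
```

If all three items come back OK for a message you know has an attachment, it's probably worth writing the real parsing code.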
PS: if worst comes to worst, and your use case is as simple and rigid as you described, you can just always fetch BODYSTRUCTURE and (BODY[1]), fall back to RFC822 if that fails, and get the attachment names by running a hacky regexp on the structure instead of a real parse. I wouldn't write this for anything but a one-shot script or a quick-and-dirty prototype to learn about Gmail, but for those cases, I probably would.
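For the record, the hacky regexp version might look something like this. The function name attachment_names is invented, and the sketch assumes filenames arrive as plain quoted strings; RFC 2231 encoded names (FILENAME*) and IMAP literals will slip straight past it.

```python
import re

# Matches "FILENAME" "some name" and captures the name, allowing \" escapes.
_FILENAME = re.compile(r'"FILENAME"\s+"((?:[^"\\]|\\.)*)"', re.IGNORECASE)


def attachment_names(bodystructure):
    """Pull non-empty FILENAME values out of a raw BODYSTRUCTURE response."""
    if isinstance(bodystructure, bytes):
        # imaplib hands back bytes; decode leniently for this quick hack.
        bodystructure = bodystructure.decode("utf-8", "replace")
    return [name for name in _FILENAME.findall(bodystructure) if name]


raw = (
    b'(("TEXT" "PLAIN" ("CHARSET" "UTF-8") NIL NIL "QUOTED-PRINTABLE" 56 1 NIL NIL NIL NIL)'
    b'("TEXT" "HTML" ("CHARSET" "UTF-8") NIL NIL "BASE64" 12345 200 NIL '
    b'("attachment" ("FILENAME" "")) NIL NIL)'
    b'("IMAGE" "JPEG" ("NAME" "funny picture") NIL NIL "BASE64" 56789 NIL '
    b'("attachment" ("FILENAME" "image.jpg")) NIL NIL) "MIXED" NIL NIL NIL)'
)
names = attachment_names(raw)
```

Empty filenames are dropped, so a part with a blank FILENAME doesn't get mistaken for a real attachment.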