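For the past few days I have been writing a script that parses automatically generated help desk tickets and stores their contents in a database. While testing I ran into a few emails whose encoding seems to make the script fail. Here is an example of one of the RFC822s: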

"[(b'9255(RFC822{12558}',b'Delivered:XXXXXXXXX Received:通过10.220.77.132与SMTP id g4csp176213vck; Mon,28日2014年09:37:05-0700(PDT) -x-收到:通过10.67.30.130与SMTP id ke2mr39896936pad.44.1406565425185; Mon,28日2014年09:37:05-0700(PDT) Return-路径: Received:从XXXXXXXXX(XXXXXXXXX[74.125.149.112]) 通过XXXXXXXXX与SMTP id yh3si18379315pab.170.2014.07.28.09.37.04 ; Mon,28日2014年09:37:04-0700(PDT) Received-SPF:没有(XXXXXXXXX:XXXXXXXXX不指定允许发送者主持)的客户机-ip=74.125.149.141; Authentication-结果:XXXXXXXXX; spf=中性(XXXXXXXXX:XXXXXXXXX不指定允许发送者主持)v Received:从XXXXXXXXX([74.125.149.141])通过XXXXXXXXX([74.125.148.10])与SMTP; Mon,28日2014 16:37:04GMT Received:从XXXXXXXXX([209.85.213.178])(使用TLSv1)通过XXXXXXXXX([74.125.148.12])与SXXXXXXXXX;Mon,28日2014年09:37:04PDT Received:通过XXXXXXXXX与SMTP id uq10sf3897971igb.11 ;Mon,28日2014年09:37:03-0700(PDT) -x-谷歌器-签署:v=1;a=rsa-sha256;c=轻松/放松; d=1e100.net;s=20130820; h=x-gm-消息国家:mime-version:自:日期:主题:信息-id :x-原始发送者:x-原来认证的-结果:优先级 :邮寄名单:名单-id:列表的职位:列助:列-档案 :列-取消订阅:content-type:内容传送编码; bh=H+FlcmWQAFURCHnDFK/bNHUOvofUAPB8bcDYlBceyxE=; b=LoR8D1MK8eoDG9DLkP9gkfR82+EGUIEeOTLpqymqxyx9HJl0C9BW6iwPD7ogrjfbv4 xWYumML6RCinpcZc4d6VCDSw+akXLdhiol+lbWJBZWvgN4BQPgHJwCF6EaHYf3h8j4tq /KAZIkXowz4/WKW8STri4BVjlA2a4LPwV/wazP+I9Kvr1yz433ymd+iCY1V0NexTI+cb 9m3IyL8sqB0+Efyu+XQrR2y7ZdXDPwdzGS/WNHJBtKga5xPDtPga+21pozVMCbuCc/cj Cx9me6cVo19PrNKIOtSimDZ1u6ELdpVr4ipryqsat8aryyicphje34ofplqsptxjm1ei ngyg== -x-Gm-消息国家:ALoCoQkb908wRLWedDE+CtRzjD6VwC6Nja6duttyoVAdf+TFFn+uCxFB0Kwd5jk411YWdMD2G6HuFeRj2y3q7ezte/vTvPLfymDIkHwZQa1r1zQ8I1B254t6v01ourr8inf/41aPGnnD X-Received:通过10.42.48.74与SMTP id r10mr26049776icf.18.1406565423564; Mon,28日2014年09:37:03-0700(PDT) -x-收到:通过10.42.48.74与SMTP id r10mr26049775icf.18.1406565423537; Mon,28日2014年09:37:03-0700(PDT) -x-BeenThere:XXXXXXXXX Received:通过10.50.153.15与SMTP id vc15ls1961411igb.42.gmail;Mon,28Jul 2014年09:37:03-0700(PDT) -x-收到:通过10.66.254.37与SMTP id af5mr39703901pad.113.1406565423331; Mon,28日2014年09:37:03-0700(PDT) Received:从XXXXXXXXX(XXXXXXXXX[74.125.149.158]) 通过XXXXXXXXX与SMTP 
id da9si9190520pdb.425.2014.07.28.09.37.02 ; Mon,28日2014年09:37:03-0700(PDT) Received-SPF:没有(XXXXXXXXX:XXXXXXXXX不指定允许发送者主持)的客户机-ip=207.211.31.47; Received:从XXXXXXXXX([207.211.31.47])通过XXXXXXXXX([74.125.148.10])与SMTP; Mon,28日2014 16:37:02GMT Received:从XXXXXXXXX(XXXXXXXXX [129.135.112.43])(使用TLS)通过XXXXXXXXX;Mon,28Jul 2014年12:37:01-0400 Received:从XXXXXXXXX(129.135.128.210)通过XXXXXXXXX (129.135.112.45)与Microsoft SMTP Server id14.3.181.6;Mon,28日2014年 11:36:58-0500 Received:从ITSDC50([127.0.0.1])通过XXXXXXXXX与微软 SMTPSVC(6.0.3790.4675); Mon,28日2014年11:36:58-0500 MIME-Version:1.0 From: : Date:Mon,28日2014年11:36:58-0500 Subject:派遣这/关心的情况:SC-118656-7031 Message-ID: -x-OriginalArrivalTime:28Jul2014 16:36:58.0498(UTC)FILETIME=[26792E20:01CFAA82] -x-MC-独一无二的:114072812370105901 -x-换的水平:(S:85.19264/99.90000 简历:99.9000FC:95.5390LC:95.5390R:95.9108 P:95.9108 M:97.0282 C:98.6951 ) -x-换器:0skipped:未启用 -x-换设置:1(0.1500:0.1500)简历gt6gt5gt4gt3gt2gt1 -x-换地址:从[1094/49] -x-换nxpr:disp=中性的,envrcpt=XXXXXXXXX -x-换浦:bodyHash=9500f76054cf97c2a0eec20f8940768958faf6c3,headerHash=eb9362a172738328a8b8a8ae406c42a63f5545f9,键名=4,rcptHash=e0dd4695780dcb1818e78b482447ac976870bcbe,品支持并采用访问策略语言=207.211.31.47,version=1 -x-原始发送者:XXXXXXXXX -x-原来认证的-结果:XXXXXXXXX;防晒指数=性 (XXXXXXXXX:XXXXXXXXX不指定允许发送者 主持)smtp。邮件=XXXXXXXXX Precedence:表 Mailing-清单:表XXXXXXXXX接触XXXXXXXXX List-ID: -x-谷歌-Group Id:511158325204 List-职:, List-帮助:, List-档案: List-取消订阅:, Content-Type:text/plain;charset=UTF-8 Content-Transfer-Encoding:base64 -x-换海王星:于0/0/0.00/0 -x-换的水平:(S:65.87536/99.90000 简历:99.9000FC:95.5390LC:95.5390R:95.9108 P:95.9108 M:97.0282 C:98.6951 ) -x-换器:0skipped:未启用 -x-换设置:5(2.0000:0.0200)s cv fc lc gt6gt5gt4GT3gt2gt1英尺lt r p m c -x-换地址:从[db-null] -x-换nxpr:disp=中性的,envrcpt=XXXXXXXXX -x-换浦:bodyHash=45f4f2e59005199791055b3d1f937e1d3fb7d7ca, headerHash=ca981838d5783da04d9d38e3fffc3f5907100fcf, keyName=4, rcptHash=4f3dee680a09495dc5b095849a4225f49c4a45f4, sourceip=74.125.149.141, version=1 
Q2FzZSBOdW1iZXI6ICAgICAgICAgU0MtMTE4NjU2LTcwMzENClNldmVyaXR5IExldmVsOiAg ICAgIE5vcm1hbA0KQWNjb3VudCBOYW1lOiAgICAgICAgSENSIE1hbm9yY2FyZQ0KU2l0ZSBO YW1lOiAgICAgICAgICAgMzAxDQpDbGllbnQgTmFtZTogICAgICAgICBBbWFuZGEgUGVucm9k DQpDbGllbnQgUGhvbmU6ICAgICAgICANCkNsaWVudCBNYWlsUGF0aDogICAgIGFtYW5kYS5w ZW5yb2RAaGNyLW1hbm9yY2FyZS5jb20NCkNhc2UgUHJvZHVjdDogICAgICAgIEhDUi1GaWVs ZCBEZXBsb3ltZW50DQpDYXNlIEtleXdvcmQ6ICAgICAgICBGRC1BU0QNCg0KDQoNClBsZWFz ZSBDbGljayBCZWxvdyB0byBVcGRhdGUgQ2FzZTogDQoNCg0KUHJvYmxlbSBEZXNjcmlwdGlv bg0KKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioq KioqKioqKioNCjw8LSBUaGlzIENhc2UgaXMgYSBTdWItQ2FzZSBvZjogRU0tMTE4NjU2LTcw MTcgIC0+Pg0KDQpQbGVhc2UgZGlzcGF0Y2ggd2lyaW5nIHRlY2ggdG8gaW5zdGFsbCB0d28g bmV3IG5ldHdvcmsgZHJvcHMuIE9uZSBpbiB0aGUgTnVyc2UgTWFuYWdlIE9mZmljZSBhbmQg b25lIGluIHRoZSBDYXNlIE1hbmFnZW1lbnQgT2ZmaWNlDQoNCkxvY2F0aW9uIG9mIGRyb3Ag aXM6ICAgICAgIE51cnNlIE1hbmFnZXIgT2ZmaWNlICYgQ2FzZSBNYW5hZ2VtZW50IE9mZmlj ZQ0KUGhvbmUgRXh0IChJZiBQaG9uZSBEcm9wKTogbi9hDQoNCk9ubHkgQ2F0NWUgUGxlbnVt IFJhdGVkIChDTVApIGNhYmxlIGNhbiBiZSB1c2VkIGZvciBuZXcgZHJvcHMuIEFkZGluZyBS YWNld2F5L1dpcmVtb2xkIGlzIG5vdCBhbiBvcHRpb24gd2l0aG91dCBwcmlvciBhcHByb3Zh bC4gSWYgUmFjZXdheS9XaXJlbW9sZCBpcyByZXF1aXJlZCwgcGxlYXNlIG5vdGlmeSB5b3Vy IGJ1eWVyIGFuZCByZXF1ZXN0IHRoZXkgb2J0YWluIGFwcHJvdmFsLiBTaW5nbGUgZ2FuZyBm YWNlLXBsYXRlIHNob3VsZCBiZSB1c2VkIChzdXJmYWNlIG1vdW50IGJveGVzIHNob3VsZCBu b3QgYmUgdXNlZCB1bmxlc3MgaW5zdGFsbGluZyBhIFdBUCwgUE9DIHNjcmVlbiwgb3IgZ2l2 ZW4gY3VzdG9tZXIgYXBwcm92YWwpLiANCg0KRGF0YSBMYWJlbGluZzoNCi0tLS0tLS0tLS0t LS0tDQpXYWxsIEphY2sgQXJlYToNCkVhY2ggd2FsbCBqYWNrIHdpbGwgYmUgbGFiZWxlZCBp biBzZXF1ZW5jZSBmb3IgaWRlbnRpZmljYXRpb24gcHVycG9zZXMuDQpBbGwgbGFiZWxzIHdp bGwgYmUgY29tcHV0ZXIgZ2VuZXJhdGVkLg0KVGhlIGxhYmVsaW5nIHNlcXVlbmNlIHdpbGwg YmU6DQpDbG9zZXQsIFJhY2ssIFBhdGNoIFBhbmVsLCBQYXRjaCBQYW5lbCBQb3J0Li4uDQpF eGFtcGxlIG9mIGhvdyBkcm9wIHdvdWxkIGJlIGxhYmVsZWQ6IDEtQi0xLTI0DQpUaGUgYWJv dmUgbGFiZWwgd291bGQgcmVwcmVzZW50OiANCiAgICAgQ2xvc2V0IDENCiAgICAgUmFjayBC 
IA0KICAgICBQYXRjaCBQYW5lbCAxIA0KICAgICBQYXRjaCBQYW5lbCBQb3J0IDI0DQoNCkRh dGEgTURGL0lERiBsYWJlbGluZzoNCi0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t DQpFYWNoIFJhY2sgd2lsbCBiZSBsYWJlbGVkIOKAnENsb3NldCB4IFJhY2sgeeKAnSAoeCA9 IDEtNC4gQ2xvc2V0IDEgc2hvdWxkIGJlIHRoZSBNREYuIENsb3NldCAyIHNob3VsZCBiZSBJ REYjMSxldGPigKYpIEVhY2ggUGF0Y2ggUGFuZWwgd2lsbCBiZSBsYWJlbGVkIOKAnFBhdGNo IFBhbmVsIHjigJ0gKHggPSAxLTQuKQ0KDQpTd2l0Y2hlcyBzaG91bGQgYmUgbGFiZWxlZCBB LUY6IFN3aXRjaCBBLCBTd2l0Y2ggQiwgZXRjLi4uDQoNCklmIFJhY2tzLCBQYXRjaCBQYW5l bHMsIGFuZCBzd2l0Y2hlcyBhdCB5b3VyIGRlc3RpbmF0aW9uIGFyZSBub3QgcHJvcGVybHkg bGFiZWxlZCwgcGxlYXNlIGNhbGwgSU5HUiBjb250YWN0IHRvIHByb3Blcmx5IGlkZW50aWZ5 IGVhY2ggY2xvc2V0LCByYWNrLCBhbmQgcGF0Y2ggcGFuZWwgaW4gb3JkZXIgdG8gaGF2ZSB0 ZWNobmljaWFuIHByb3Blcmx5IGxhYmVsIGVhY2guIElOR1IgdGVjaCBjYW4gYWxzbyBoZWxw IGxvY2F0ZSBhdmFpbGFibGUgc3dpdGNoIHBvcnRzIGlmIGFsbCBhcHBlYXJzIGZ1bGwuDQoN ClBhdGNoIGRyb3AgZnJvbSBwYXRjaCBwYW5lbCB0byBmYWNpbGl0eSBzd2l0Y2ggYW5kIGZy b20gd2FsbCBqYWNrIHRvIG5ldHdvcmsgZGV2aWNlLiBQcm92aWRlIHBhdGNoIGNhYmxlIGZv ciBjb21wdXRlciBvciBuZXR3b3JrIGRldmljZSB0byB3YWxsIGphY2sgaWYgbmVlZGVkLg0K DQoqKioqKkRBV EEgSk9CIERFTElWRVJBQkxFUyoqKioqOg0KLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0NCkRvd25sb2FkIGFuZCBzZW5kIGRyb3AgdGVzdCByZXN1bHRzIChp ZiB5b3UgZG8gbm90IGhhdmUgYSBtYWNoaW5lIGNhcGFibGUgb2YgZG93bmxvYWRpbmcgdGVz dCByZXN1bHRzLCB0YWtlIHBpY3R1cmVzIG9mIHlvdXIgbGl2ZSB0ZXN0ZXIgc2hvd2luZyB0 aGUgZHJvcCBwYXNzZXMpIGFuZCBhIGRpZ2l0YWwgcGhvdG8gb2YgZWFjaCBwcm9wZXJseSBs YWJlbGVkIHdhbGwgamFjayANDQphbmQgMSBkaWdpdGFsIHBob3RvIG9mIHBhdGNoIHBhbmVs IHRoYXQgc2hvd3MgeW91ciBqb2IgaXMgcHJvcGVybHkgbGFiZWxlZCBhbmQgdGFnZ2VkIGFu ZCBlbWFpbCB0byBBU0QgY29udGFjdC4NCg0KUGxlYXNlIGNhbGwgSU5HUiBjb250YWN0IHRv IGRpc2N1c3MgYW55IGlzc3VlcyB3aXRoIGpvYi4NCg0KSUYgV09SSyBJUyBDQU5DRUxMRUQg T1IgQ09NUExFVEUgVVBPTiBBUlJJVkFMIFBMRUFTRSBPQlRBSU4gUkVRVUVTVEVEIERFTElW RVJBQkxFUyBQUklPUiBUTyBMRUFWSU5HIFNJVEUuDQoNCklOR1IgQ29udGFjdCBpbmZvOg0K UmljayBNYXJ0aW4gYXQgODAwLTYwMy01NTAwIGV4dC4gNTExMSAobHVuY2ggMTowMHBtIC0g 
MjowMHBtIEVTVCkNClJpY2sgWWFuY2V5IGF0IDgwMC02MDMtNTUwMCBleHQuIDUxMTUgKGx1 bmNoIDI6MDBwbSAtIDI6MzBwbSBFU1QpDQpEb3VnIEpvaG5zb24gYXQgODAwLTYwMy01NTAw IGV4dC4gNTIwMg0KU3RldmUgSmFrdWJpayBhdCA4MDAtNjAzLTU1MDAgZXh0LiA1NDU2DQpM b2dhbiBIYWdhIGF0IDgwMC02MDMtNTUwMCBleHQuIDU0NzYNClRyYXZpcyBCYWlsZXkgYXQg ODAwLTYwMy01NTAwIGV4dC4gNTIwOQ0KSXNhYWMgRGlja3NvbiBhdCA4MDAtNjAzLTU1MDAg ZXh0LiA1MTk4DQoNCk9OTFkgSUYgWU9VIEFSRSBVTkFCTEUgVE8gUkVBQ0ggSU5HUiBjb250 YWN0LCBhZnRlciBsZWF2aW5nIG1lc3NhZ2VzIGZvciBlYWNoIHBlcnNvbiBsaXN0ZWQgYWJv dmUgYW5kIHdhaXRpbmcgNSBtaW51dGVzIGZvciBhIHJldHVybiBjYWxsLCBjb250YWN0Og0K VG9ueSBCdXRsZXIgYXQgODAwLTYwMy01NTAwIGV4dC4gNTE0MA0KQmFyYiBFZHdhcmRzIGF0 IDgwMC00MjctMTkwMiBleHQuIDUxODMNCkRhdmUgSGlyZSBhdCA4MDAtNDI3LTE5MDIgZXh0 LiA2NDE4DQoNCkFTRCBBZnRlciBIb3VycyBudW1iZXIgaXMgODI4LTYyNC0xMDk5IGFuZCBl bWFpbCBmb3IgdGhpcyBhY2NvdW50IGlzIHRlYW1pbnRlcmdyYXBoQGFzZC11c2EuY29tDQoN CjwwNy8yOC8xNCAxMTozNiBDYXNlIG9wZW5lZCBieTogIHJtYXJ0aW4gKDI1Nik3MzAtNTEx MT4NCg0KDQpQcm9ibGVtIFNvbHV0aW9uDQoqKioqKioqKioqKioqKioqKioqKioqKioqKioq KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKg0KUmVmZXJyZWQgZm9yIFJlc29sdXRp b24gVG86IEFTRA0KPDA3LzI4LzE0IDExOjM2IENhc2UgZWRpdGVkIGJ5OiBybWFydGluICgy NTYpNzMwLTUxMTE+DQoqfip+Kn4qfip+Kn4qfip+Kn4qfip+Kn4qfip+Kn4qfip+Kn4qfip+ Kn4qfip+Kn4qfip+Kg0KDQoNCg0KDQpfX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f X19fX19fX19fX19fX19fX18NCk5vdGljZSByZXF1aXJlZCBieSBsYXc6ICBUaGlzIGVtYWls IG1heSBjb25zdGl0dXRlIGFuIGFkdmVydGlzZW1lbnQgb3Igc29saWNpdGF0aW9uIHVuZGVy IFUuUy4gbGF3IGlmIGl0cyBwcmltYXJ5IHB1cnBvc2UgaXMgdG8gYWR2ZXJ0aXNlIG9yIHBy b21vdGUgYSBjb21tZXJjaWFsIHByb2R1Y3Qgb3Igc2VydmljZS4gIFlvdSBtYXkgY2hvb3Nl IG5vdCB0byByZWNlaXZlIGFkdmVydGlzaW5nIGFuZCBwcm9tb3Rpb25hbCBtZXNzYWdlcyBm cm9tIEFTRCAoZXhjZXB0IGZvciB3d3cuYXNkLXVzYS5jb20sIHdoaWNoIG1hbmFnZXMgZW1h aWwgcHJlZmVyZW5jZXMgdGhyb3VnaCBhIHNlcGFyYXRlIHByb2Nlc3MpIGF0IHRoaXMgZW1h aWwgYWRkcmVzcyBieSBmb3J3YXJkaW5nIHRoaXMgbWVzc2FnZSB0byBsZWF2ZW1lYWxvbmVA 
YXNkLXVzYS5jb20uICBJZiB5b3UgZG8gc28sIHRoZSBzZW5kZXIgb2YgdGhpcyBlbWFpbCB3 aWxsIGJlIG5vdGlmaWVkIHByb21wdGx5IGFuZCB5b3Ugd2lsbCBub3QgYmUgY29udGFjdGVk IGFnYWluLiAgT3VyIHByaW5jaXBhbCBwb3N0YWwgYWRkcmVzcyBpcyA3NzUgR29kZGFyZCBD b3VydCBBbHBoYXJldHRhLCBHQSAgMzAwMDUuDQoNCg== '), b')']"

Is the body of the email encoded? If so, how should I go about decoding it?

Any help would be appreciated.

Solution

The body is encoded (Content-Transfer-Encoding: base64), which is not the same as being encrypted. Pasting the first group of characters into an online decoder

Q2FzZSBOdW1iZXI6ICAgICAgICAgU0MtMTE4NjU2LTcwMzENClNldmVyaXR5IExldmVsOiAg

decodes to

Case Number:         SC-118656-7031
Severity Level:  

Python has a base64 library for decoding, but I would be disappointed if imaplib did not have some mechanism that simplifies this.
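The same check can be done locally instead of with an online decoder. The following is a minimal sketch using only the standard base64 module; the variable names are made up for illustration, and the input is the first group of characters quoted above. It assumes the body is UTF-8, as the Content-Type header of the sample message states.

```python
import base64

# First 72 characters of the base64 body from the message above.
first_group = "Q2FzZSBOdW1iZXI6ICAgICAgICAgU0MtMTE4NjU2LTcwMzENClNldmVyaXR5IExldmVsOiAg"

# b64decode returns bytes; the message headers declare charset=UTF-8,
# so the bytes are decoded as UTF-8 to get a str.
decoded = base64.b64decode(first_group).decode("utf-8")

print(decoded)
```

This only decodes a fragment by hand. For whole messages, a parser that reads the Content-Transfer-Encoding header is less error-prone.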

Other tips

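You can use the email package for this. What you have there is a list whose first item is a tuple, and the second element of that tuple is the entire email message. Let's say you have that bytes object in a variable called msg_bytes. Then you can parse the message using: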

import email.parser
msg = email.parser.BytesParser().parsebytes(msg_bytes)

Then you can access the different parts of the message (see the documentation for email.message.Message):

# get a bytes object containing the base64-decoded message
textbytes = msg.get_payload(decode=True)

# get the content charset
content_charset = msg.get_content_charset()

# decode the text to obtain a string object
# (fall back to UTF-8 if the message declares no charset)
text = textbytes.decode(content_charset or 'utf-8')

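This should handle most, if not all, valid emails. Note that for a multipart message get_payload(decode=True) returns None, so each part has to be decoded separately.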
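Putting the steps together, here is a hedged end-to-end sketch. The fetch_data list is a hypothetical stand-in shaped like the result in the question (a list whose first item is a tuple with the raw message as its second element), and extract_text is an illustrative helper name, not part of any library. It uses walk() so that both single-part and multipart messages are covered, and it assumes UTF-8 when no charset is declared.

```python
import base64
import email.parser

# Build a small base64-encoded message so the example is self-contained.
body = base64.b64encode("Case Number: SC-118656-7031\r\n".encode("utf-8"))
raw = (
    b"MIME-Version: 1.0\r\n"
    b"Subject: example\r\n"
    b"Content-Type: text/plain; charset=UTF-8\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n" + body + b"\r\n"
)

# Hypothetical stand-in for the fetched data shown in the question.
fetch_data = [(b"9255 (RFC822 {%d}" % len(raw), raw), b")"]


def extract_text(fetch_data):
    """Return the first text/plain part as a str, or None if there is none."""
    msg_bytes = fetch_data[0][1]
    msg = email.parser.BytesParser().parsebytes(msg_bytes)
    # walk() yields the message itself and, for multipart mail, every sub-part.
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
            charset = part.get_content_charset() or "utf-8"  # assumed fallback
            return payload.decode(charset, errors="replace")
    return None


text = extract_text(fetch_data)
print(text)
```

Using errors="replace" is a design choice for a ticket-parsing script: a badly encoded character becomes a placeholder instead of aborting the whole run.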

Try Imbox, where you don't need to deal with the decoding yourself.

imaplib is a very low-level library, and the results it returns are hard to work with.

Installation

pip install imbox

Usage

from imbox import Imbox

with Imbox('imap.gmail.com',
        username='username',
        password='password',
        ssl=True,
        ssl_context=None,
        starttls=False) as imbox:

    all_inbox_messages = imbox.messages()
    for uid, message in all_inbox_messages:
        message.sent_from
        message.sent_to
        message.body
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow