Question

When sending

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. 

in an email as a single line, as quoted above, Thunderbird converts it to this:

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy 
eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam 
voluptua.

I believe this is related to the format=flowed parameter in this header:

Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit

When the mail is displayed in Thunderbird, it looks perfectly fine: the text appears as one line. However, when I parse it with Python's Message.get_payload, the line breaks are still there, which ruins readability.

How can I make Python convert these 'flowed' lines of text to normal ones?


Solution

Use the formatflowed library to convert such text to 'regular' text:

from formatflowed import convertToWrapped

text = convertToWrapped(msg.get_payload(decode=True), character_set=msg.get_content_charset() or 'us-ascii')

Note that you need to pass in a byte string, not a Unicode value; the library decodes the text to Unicode for you.
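
As a rough illustration of how to get a byte string and its character set out of a parsed message, the sketch below uses only the standard library email package (Python 3). The helper name flowed_payload and the sample message are made up for this example; get_param, get_content_charset and get_payload(decode=True) are regular email.message.Message methods.

```python
from email import message_from_bytes

# Hypothetical sample message, modelled on the headers in the question.
raw = (
    b"Content-Type: text/plain; charset=ISO-8859-15; format=flowed\r\n"
    b"Content-Transfer-Encoding: 7bit\r\n"
    b"\r\n"
    b"Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy \r\n"
    b"eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam \r\n"
    b"voluptua.\r\n"
)
msg = message_from_bytes(raw)


def flowed_payload(msg):
    """Return (payload_bytes, charset, is_flowed) for a single-part message.

    flowed_payload is an illustrative helper, not part of any library.
    """
    # The 'format' parameter of Content-Type defaults to 'fixed' when absent.
    is_flowed = str(msg.get_param("format", "fixed")).lower() == "flowed"
    # Fall back to us-ascii when the header names no charset.
    charset = msg.get_content_charset() or "us-ascii"
    # decode=True undoes the Content-Transfer-Encoding and returns bytes.
    payload = msg.get_payload(decode=True)
    return payload, charset, is_flowed


payload, charset, is_flowed = flowed_payload(msg)
```

Only when is_flowed is true does the payload need converting; a part without format=flowed can be decoded with the charset and used as it is.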

Obligatory disclaimer: I am the author of that library, albeit quite some time ago now.
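
If installing a library is not an option, the core of the format=flowed rules (RFC 3676) can be approximated in a few lines. The following is a minimal sketch and not the formatflowed implementation: the function name unflow and its delsp argument are assumptions made for this example. It expects text that has already been decoded to a string, and it only handles soft line breaks, space-stuffing, quote depth and the signature separator.

```python
def unflow(text, delsp=False):
    """Join the soft-wrapped lines of a format=flowed body (sketch only)."""
    paragraphs = []   # finished (quote_depth, text) pairs
    current = None    # paragraph still being assembled, or None

    for line in text.splitlines():
        # Quote depth is the number of leading '>' characters.
        stripped = line.lstrip(">")
        depth = len(line) - len(stripped)
        # Undo space-stuffing: drop a single leading space.
        if stripped.startswith(" "):
            stripped = stripped[1:]
        # A trailing space marks a soft break, except on the
        # signature separator "-- ", which is always a fixed line.
        is_sig = stripped == "-- "
        flowed = stripped.endswith(" ") and not is_sig
        if flowed and delsp:
            # With DelSp=yes the trailing space is not part of the content.
            stripped = stripped[:-1]

        # A change in quote depth or a signature line ends the open paragraph.
        if current is not None and (current[0] != depth or is_sig):
            paragraphs.append(current)
            current = None

        if current is None:
            current = (depth, stripped)
        else:
            current = (depth, current[1] + stripped)

        if not flowed:
            paragraphs.append(current)
            current = None

    if current is not None:
        paragraphs.append(current)

    # Re-attach quote markers to each rebuilt paragraph.
    return "\n".join(
        (">" * d + " " + t) if d else t for d, t in paragraphs
    )


sample = (
    "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy \n"
    "eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam \n"
    "voluptua.\n"
)
print(unflow(sample))
```

The trailing space on each wrapped line is what signals a soft break, so the payload must not be stripped of trailing whitespace before it reaches a function like this.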

Licensed under: CC-BY-SA with attribution