Question

We're in the process of converting a C++ OpenSSL-based project to Python with M2Crypto, and we've run into a somewhat unusual issue with the BIO routines from M2Crypto. Specifically, any call to BIO.readlines() hangs forever on a file object.

Here's a quick sample of what we tried:

f = open('test.txt','w')
f.write('hello world\n')
f.close()

import M2Crypto.BIO
bio = M2Crypto.BIO.openfile('test.txt','r')
lines = bio.readlines()
# the above call hangs forever

To ensure we didn't have something horribly wrong with our OpenSSL installation, we created a small test program to read the test.txt file we had just written:

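#include <stdio.h>
#include <openssl/bio.h>
#include <openssl/err.h>
int main() {
    const int maxrd = 4096;
    char line[maxrd];
    int rd;
    BIO* bio = BIO_new_file("test.txt","r");
    if (bio == NULL) {
        /* could not open the file */
        printf("BIO error %lu\n", ERR_get_error());
        return 1;
        }
    /* a return value of 0 or -1 from BIO_gets() ends the loop */
    while((rd = BIO_gets(bio, line, maxrd)) > 0) {
        printf("%s",line);
        }
    if (rd == -1) {
        printf("BIO error %lu\n", ERR_get_error());
        }
    BIO_free(bio);
    return 0;
    }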

No problem.

We've been studying the M2Crypto-0.21.1/SWIG/_bio.i wrapper file, and think we may have found the source of the issue. Line 109 tests the return value from BIO_gets():

if (r < 0) {
    // return Py_None
    }

However, the man page for BIO_gets() suggests it can return either 0 or -1 to indicate end-of-stream. With the test above, a return value of 0 is not treated as end-of-stream.

I believe it should be:

if (r < 1) {
    // return Py_None
    }

But we wanted to see whether others had encountered this, or whether we are mistaken in our understanding of BIO_gets().
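
To illustrate why a check of r < 0 would show up as a hang rather than an error, here is a minimal Python model of such a read loop. It is a sketch only, not the M2Crypto code: make_fake_bio_gets and readlines_none_only are hypothetical names, the stand-in is assumed to return an empty string at end-of-stream (the r == 0 case), and the iteration cap exists purely so the demonstration terminates.

```python
def make_fake_bio_gets(data):
    """Build a hypothetical stand-in for a wrapped BIO_gets() call.

    Assumption: at end-of-stream it returns '' (the r == 0 case), never None.
    """
    pending = list(data)

    def fake_bio_gets():
        return pending.pop(0) if pending else ''
    return fake_bio_gets


def readlines_none_only(gets, max_calls=1000):
    """Read loop that treats only None as end-of-stream."""
    lines = []
    for _ in range(max_calls):   # cap so the demo stops; a real loop has none
        buf = gets()
        if buf is None:
            return lines, True   # finished normally
        lines.append(buf)
    return lines, False          # gave up: an uncapped loop would spin forever


lines, finished = readlines_none_only(make_fake_bio_gets(['hello world\n']))
print(finished, len(lines))
```

In this model the loop never sees None, so it keeps appending empty strings until the cap is reached. That matches the symptom we see with BIO.readlines().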

--- Details ---
Python 2.7
M2Crypto 0.21.1
OpenSSL 0.9.8q-fips 2 Dec 2010
FreeBSD 8.2-RELEASE-p4


Solution

In the event others stumble across this in the future, I wanted to share our patch.

--- M2Crypto-0.21.1.orig/SWIG/_bio.i  2011-01-15 14:10:06.000000000 -0500
+++ M2Crypto-0.21.1/SWIG/_bio.i   2012-02-14 11:34:15.000000000 -0500
@@ -106,7 +106,7 @@
     Py_BEGIN_ALLOW_THREADS
     r = BIO_gets(bio, buf, num);
     Py_END_ALLOW_THREADS
-    if (r < 0) {
+    if (r < 1) {
         PyMem_Free(buf);
         if (ERR_peek_error()) {
             PyErr_SetString(_bio_err, ERR_reason_error_string(ERR_get_error()));

NOTE: For those familiar with the internals of M2Crypto, there were essentially three solutions to this problem. The first is the patch posted above. Since we believe this matches the intention of the man page for BIO_gets(), it's the solution we opted for.

The second solution was to patch M2Crypto/BIO.py. Specifically, to patch the code that implements BIO.readlines() to test the return value from m2.bio_gets() for either None or len(buf) == 0, and treat both as end-of-stream.

The third solution was simply to avoid calling BIO.readlines(), and restrict yourself to calling BIO.readline() (note -- singular readline vs readlines), and to test the return value from BIO.readline() for either None or len(buf) == 0.

The third solution may not seem like much of an option -- more like avoidance. But if you are concerned about deploying an application into an environment where M2Crypto may not be patched, this approach should be the most compatible, since it works whether or not the patch is present.
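
As a rough sketch of the third approach, the helper below loops over readline() and stops on either None or an empty result. The name read_all_lines is hypothetical, and io.BytesIO is used only as a stand-in so the example runs without M2Crypto installed; with M2Crypto you would pass the object returned by M2Crypto.BIO.openfile() instead.

```python
import io


def read_all_lines(bio):
    """Collect lines via readline(), treating None or '' as end-of-stream."""
    lines = []
    while True:
        buf = bio.readline()
        if buf is None or len(buf) == 0:
            break
        lines.append(buf)
    return lines


# Stand-in for a BIO object (an assumption made for the demo only).
demo = io.BytesIO(b'hello world\nsecond line\n')
result = read_all_lines(demo)
print(result)
```

A blank line in the file still arrives as a newline character rather than an empty string, so the length test should not cut the read short.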

We've submitted our patch to the developer of record, Heikki, but he hasn't had a chance to review our suggestion yet. Until there is an official answer one way or the other, I wanted to share our thoughts.

OTHER TIPS

This problem relates to bug #717675 in Debian Linux.

It's not reproducible in Fedora 21 and I didn't find any patches for Fedora that would modify either BIO.py or _bio.i.

Here is the patch which was posted for Debian:

--- /usr/lib64/python2.7/site-packages/M2Crypto/BIO.py  2011-01-15 20:10:05.000000000 +0100
+++ BIO.py  2015-05-20 09:24:46.600582999 +0200
@@ -73,6 +73,8 @@
             buf=m2.bio_gets(self.bio, 4096)
             if buf is None:
                 break
+            if len(buf)==0:
+                break
             lines.append(buf)
         return lines
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow