Question

I get a segmentation violation (SIGSEGV) when I try to intercept calls via LD_PRELOAD from Cython, and I don't understand why.

"""An experiment in shimming from Cython / Python."""

cdef extern from "dlfcn.h":
    void* dlsym(void*, char*)
    void* RTLD_NEXT

cdef extern int execvp(const char *file, char *const argv[]) with gil:
    print "Intercepted lookup of %r" % file
    libc_execvp = dlsym(RTLD_NEXT, "execvp")
    if libc_execvp:
        with nogil:
            return (<int(*)(const char*, char * const *) nogil>libc_execvp)(file, argv)
    return -1

The project with an example test case is available at https://github.com/CraigJPerry/pyshim/blob/master/pyshim/pyshim.pyx

I suspect the Python runtime is not being initialised correctly and that this is the root of my issue. Is that right?

[craig@d1 pyshim](master)$ gdb env
..
Reading symbols from /usr/bin/env...Reading symbols from /usr/bin/env...(no debugging symbols found)...done.
(gdb) set environment LD_PRELOAD=pyshim/pyshim.so
(gdb) set args echo
(gdb) run
Starting program: /usr/bin/env echo
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x000000396ee0ddb0 in sem_wait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x000000396ee0ddb0 in sem_wait () from /lib64/libpthread.so.0
#1  0x0000003bbcf0c7b5 in PyThread_acquire_lock ()
   from /lib64/libpython2.7.so.1.0
#2  0x0000003bbcefad80 in ?? () from /lib64/libpython2.7.so.1.0
#3  0x0000003bbcefb62c in PyGILState_Ensure () from /lib64/libpython2.7.so.1.0
#4  0x00007ffff7df9519 in execvp (__pyx_v_file=0x7fffffffe85b "echo", 
    __pyx_v_argv=0x7fffffffe4d0) at pyshim/pyshim.c:681
#5  0x0000000000401a82 in main ()

Solution

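Cython assumes that a functional Python interpreter is already running (i.e. that you are writing an extension module to be imported by Python). Here the shared object is preloaded into a non-Python process (env), so you are embedding Python rather than extending it. That is why the backtrace ends in PyGILState_Ensure(): the "with gil" function tries to acquire the GIL when no interpreter has been initialized. You need to do that initialization yourself.

Fortunately, the fix is short: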

"""An experiment in shimming from Cython / Python."""

cdef extern from "dlfcn.h":
    void* dlsym(void*, char*)
    void* RTLD_NEXT

cdef extern from "Python.h":
    void Py_Initialize() nogil

cdef extern void initpyshim()

cdef extern int execvp(const char *file, char *const argv[]) nogil: # note nogil here
    Py_Initialize() # initialize Python
    with gil:
        initpyshim() # initialize containing module
        print "Intercepted lookup of %r" % file
        libc_execvp = dlsym(RTLD_NEXT, "execvp")
        if libc_execvp:
            with nogil:
                return (<int(*)(const char*, char * const *) nogil>libc_execvp)(file, argv)
        return -1

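If your function may be called more than once, avoid re-initializing the interpreter and the module on every call; checking Py_IsInitialized() before calling Py_Initialize() is one way to do this. You may also need to call Py_Finalize() before leaving your function. Keep in mind that a successful execvp never returns, so any cleanup placed after the call to the real execvp only runs when that call fails.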
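A minimal sketch of that guard, assuming the same Python 2 build and module name (pyshim) as above. The module_ready flag is a hypothetical name introduced here for illustration and is not part of the original answer; it relies on C globals starting out as zero.

```cython
"""Sketch: initialize the interpreter and the module only once."""

cdef extern from "dlfcn.h":
    void* dlsym(void*, char*)
    void* RTLD_NEXT

cdef extern from "Python.h":
    void Py_Initialize() nogil
    int Py_IsInitialized() nogil

cdef extern void initpyshim()

# Hypothetical flag (an assumption, not in the original answer).
# C globals start at zero, so this reads as False until the first call.
cdef bint module_ready

cdef extern int execvp(const char *file, char *const argv[]) nogil:
    global module_ready
    if not Py_IsInitialized():
        Py_Initialize()  # start the interpreter only if nobody has yet
    with gil:
        if not module_ready:
            initpyshim()  # run the module init exactly once
            module_ready = True
        print "Intercepted lookup of %r" % file
        libc_execvp = dlsym(RTLD_NEXT, "execvp")
        if libc_execvp:
            with nogil:
                return (<int(*)(const char*, char * const *) nogil>libc_execvp)(file, argv)
        return -1
```

The guard only matters when the real execvp fails and returns, since that is the only way the same process can reach this function a second time.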

If you are targeting Python 3, the module init function is named PyInit_<modname> (here, PyInit_pyshim) and returns a PyObject * reference, which you need to hold on to at least until the end of your function.

Licensed under: CC-BY-SA with attribution