Why does pthread_mutex_t segfault when trying to lock through shared memory from two different processes?

StackOverflow https://stackoverflow.com/questions/22800016

Question

I wrote a super simple wrapper for a pthread_mutex_t meant to be used between two processes:

//basic version just to test using it between two processes
struct MyLock
{
    public:
        MyLock() {
            pthread_mutexattr_init(&attr);
            pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
            pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ADAPTIVE_NP);

            pthread_mutex_init(&lock, &attr);
        }

        ~MyLock() {
            pthread_mutex_destroy(&lock);
            pthread_mutexattr_destroy(&attr);
        }

        void lock() {
            pthread_mutex_lock(&lock);
        }

        void unlock() {
            pthread_mutex_unlock(&lock);
        }

    private:
        pthread_mutexattr_t attr;
        pthread_mutex_t lock;
};

The lock works fine between regular threads in a single process. The problem appears when I run process A, which does the following in a shared memory region:

void* mem; //some shared memory from shm_open
MyLock* myLock = new(mem) MyLock;
//loop sleeping random amounts and calling ->lock and ->unlock

Process B then opens the same shared memory object (I verified that it is the same region of memory by writing combinations of characters to it) and does this:

MyLock* myLock = reinterpret_cast<MyLock*>(mem);
//same loop for locking and unlocking as process A

Process B segfaults when it tries to lock, with the backtrace leading to pthread_mutex_lock() in libpthread.so.0.

What am I doing wrong?

The backtrace I get from process B looks like this:

in pthread_mutex_lock () from /lib64/libpthread.so.0
in MyLock::lock at MyLock.H:50
in Server::setUpSharedMemory at Server.C:59
in Server::Server at Server.C
in main.C:52

The crash happens on the very first call to lock after reinterpret_cast-ing the memory to a MyLock*. If I dump the contents of MyLock in gdb in the crashing process I see:

{
  attr = {
    __size = "\003\000\000\200",
    __align = -2147483645
  },
  lock = {
    __data = {
      __lock = 1,
      __count = 0,
      __owner = 6742, //this is the lightweight process id of a thread in process A
      __nusers = 1,
      __kind = 131,
      __spins = 0,
      __list = {
        __prev = 0x0,
        __next = 0x0
      }
    },
    __size = "\001\000\000\000\000 //etc,
    __align = 1
  }
}

So it looks all right (it looks the same in gdb in the other process as well). Both applications are compiled together, with no additional optimization flags.


Solution

You didn't post the code that opens and initializes the shared memory region, but I suspect that part might be responsible for your problem.

Because pthread_mutex_t is much larger than a "combination of characters," you should check your shm_open(3)-ftruncate(2)-mmap(2) sequence by reading and writing a longer string (on the order of a kilobyte).

Don't forget to check that both endpoints can really write to the shm region and that the written data is really visible to the other side.

Process A: [open and initialize the shm]-[write AAA...AA]-[sleep 5 sec]-[read BBB...BB]-[close the shm]

Process B: (a second or two later) [open the shm]-[read AAA...AA]-[write BBB...BB]-[close the shm]
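
The check described above could be sketched roughly as follows. This is only an illustration: the helper names map_shm, fill_region and region_is are made up for this example, and the object name and size are up to you. Process A would call map_shm with create set to true, process B with create set to false.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from the question): map `size` bytes of the named
// shm object. create == true is the process A path (create + ftruncate),
// create == false is the process B path (open only).
// Returns nullptr on any failure instead of MAP_FAILED.
void* map_shm(const char* name, std::size_t size, bool create) {
    int flags = create ? (O_CREAT | O_RDWR) : O_RDWR;
    int fd = shm_open(name, flags, 0600);
    if (fd == -1) return nullptr;
    if (create && ftruncate(fd, static_cast<off_t>(size)) == -1) {
        close(fd);
        return nullptr;
    }
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    return p == MAP_FAILED ? nullptr : p;
}

// Fill the whole region with one byte value (the "AAA...AA" step).
void fill_region(void* mem, std::size_t size, char c) {
    std::memset(mem, c, size);
}

// Return true only if every byte of the region equals c.
bool region_is(const void* mem, std::size_t size, char c) {
    const char* p = static_cast<const char*>(mem);
    for (std::size_t i = 0; i < size; ++i) {
        if (p[i] != c) return false;
    }
    return true;
}
```

With a size of a few KB, process A would call fill_region(mem, size, 'A'), sleep, then expect region_is(mem, size, 'B'); process B would do the reverse. If either map_shm call returns nullptr, or the pattern is not visible on the other side, the problem is in the mapping rather than in the mutex.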

OTHER TIPS

I had a similar issue where the writer process runs as root and the reader processes run as regular users (the case of a hardware daemon). The readers would segfault as soon as they called pthread_mutex_lock() or pthread_cond_wait(), or their unlock counterparts.

I solved it by modifying the SHM file permissions using an appropriate umask:

Writer

umask(0); // clear the mask so the mode passed to shm_open is applied unchanged (the original !S_IRUSR|... expression also evaluates to 0)
FD=shm_open("the_SHM_file", O_CREAT|O_TRUNC|O_RDWR, S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH);
ftruncate(FD, 28672);
SHM=mmap(0, 28672, PROT_READ|PROT_WRITE, MAP_SHARED, FD, 0);

Readers

FD=shm_open("the_SHM_file", O_RDWR, S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH); // the mode is ignored without O_CREAT
SHM=mmap(0, 28672, PROT_READ|PROT_WRITE, MAP_SHARED, FD, 0);
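
As an alternative to changing the process-wide umask, the writer could set the permissions on the descriptor itself with fchmod. This is a different technique from the umask call in this answer, shown only as a sketch; the helper name create_world_rw_shm is hypothetical.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper (not from the answer): create a shm object of `size`
// bytes that any user may read and write, without touching the umask.
// Returns the open descriptor, or -1 on failure.
int create_world_rw_shm(const char* name, off_t size) {
    // The mode given here is still filtered by the current umask...
    int fd = shm_open(name, O_CREAT | O_RDWR, 0666);
    if (fd == -1) return -1;
    // ...so set the permission bits explicitly; fchmod is not affected by the umask.
    if (fchmod(fd, 0666) == -1 || ftruncate(fd, size) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Whichever approach is used, the readers should compare the shm_open result against -1 and the mmap result against MAP_FAILED before touching the memory. If shm_open fails on permissions and the result is passed on unchecked, mmap returns MAP_FAILED, and dereferencing that pointer is one way to end up with a segfault like the one described.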

You don't say what OS you are using, and you don't check the return value of the pthread_mutexattr_setpshared call. It's possible that your OS does not support process-shared mutexes and that this call is failing.

Licensed under: CC-BY-SA with attribution