Question

I have an application written in C++ that runs on Linux. It dynamically loads shared libraries (shared objects) at runtime: the application receives a user command and decides which shared library to load.

Is there any way to prevent the application from crashing and exiting when a crash or segfault occurs in the shared library?

I want my application to stay alive and report the crash to the user.

Solution

As Itwasntpete answered, you could install a signal handler for SIGSEGV, using sigaction(2) with SA_SIGINFO (don't use signal(2)!). However, read signal(7) carefully first.
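A minimal sketch of such an installation could look like the following. The names on_segv and install_segv_handler are hypothetical, and the handler only reports and terminates; it does not try to recover.

```cpp
#include <csignal>
#include <cstring>
#include <unistd.h>

// Hypothetical handler. With SA_SIGINFO it receives three arguments;
// info->si_addr holds the faulting address, uctx points to a ucontext_t.
extern "C" void on_segv(int sig, siginfo_t* info, void* uctx) {
    (void)info;
    (void)uctx;
    const char msg[] = "caught SIGSEGV\n";
    // write(2) and _exit(2) are async-signal-safe; std::cout and exit are not.
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;
    _exit(128 + sig);
}

// Hypothetical helper: returns true when the handler was installed.
bool install_segv_handler() {
    struct sigaction sa;
    std::memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;   // three-argument form
    sa.sa_flags = SA_SIGINFO;    // required for sa_sigaction to be used
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}
```

The application would call install_segv_handler() once at startup, before loading any plugin.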

Notice that if you want to fully catch SIGSEGV (or other synchronous, fault-generated signals like SIGBUS, SIGILL, SIGFPE, etc.) and continue processing, it is tricky and machine specific. If you simply return from your SIGSEGV handler, the machine state stays the same, and execution resumes at the machine instruction that triggered the SIGSEGV, which is then re-triggered ad infinitum (you are stuck in an endless loop).

So to be able to continue execution, you should either not return from your signal handler, or use siglongjmp(3) in it to jump to a state previously registered with sigsetjmp(3), or alter the machine state. To alter the machine state you may change the address space using mmap(2) and related calls, or you may change some [saved] processor registers through the ucontext_t* passed as the third argument to your handler, after querying the details of the signal through the siginfo_t* passed as the second argument. How to do this is system specific (it depends upon the operating system and the processor) and tricky.
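The sigsetjmp(3)/siglongjmp(3) option could be sketched roughly as below. All names (run_guarded, segv_to_jump, the globals) are hypothetical, the sketch is single-threaded, and it only jumps back: whatever state the plugin corrupted stays corrupted, and jumping over C++ frames that have destructors is undefined behavior.

```cpp
#include <csetjmp>
#include <csignal>

// Hypothetical globals: the saved recovery point and a flag telling the
// handler whether a guarded call is in progress.
static sigjmp_buf g_recovery_point;
static volatile sig_atomic_t g_guard_active = 0;

extern "C" void segv_to_jump(int sig) {
    if (g_guard_active) {
        siglongjmp(g_recovery_point, sig);  // does not return
    }
    // No guarded call in progress: fall back to the default action.
    signal(sig, SIG_DFL);
    raise(sig);
}

// Runs plugin_entry and returns true if it completed, or false if a
// SIGSEGV was caught while it was running.
bool run_guarded(void (*plugin_entry)()) {
    struct sigaction sa = {};
    struct sigaction old_sa = {};
    sa.sa_handler = segv_to_jump;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old_sa) != 0) return false;

    volatile bool completed = false;
    // Second argument 1: save the signal mask so siglongjmp restores it
    // (SIGSEGV is blocked while the handler runs).
    if (sigsetjmp(g_recovery_point, 1) == 0) {
        g_guard_active = 1;
        plugin_entry();
        completed = true;
    }
    g_guard_active = 0;
    sigaction(SIGSEGV, &old_sa, nullptr);  // restore the previous handler
    return completed;
}
```

A caller would wrap each plugin entry point, for example run_guarded(some_plugin_function), and report a failure to the user when it returns false.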

If you want to show a nice backtrace from your signal handler, consider using e.g. the libbacktrace found inside the recent GCC source tarball. (It works much better if both the program and the plugin have been compiled with debug info, e.g. with gcc -O -g.)
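As a rough illustration, here is a sketch that uses glibc's backtrace(3) and backtrace_symbols_fd(3) from <execinfo.h> instead of libbacktrace (a deliberate substitution, since libbacktrace is not always installed). The helper names are hypothetical, and the output is cruder: raw addresses and exported symbol names, no source lines.

```cpp
#include <execinfo.h>
#include <unistd.h>

// Hypothetical helper: call once at startup, outside any handler, so that
// any lazy initialization done by backtrace() happens before a crash.
void warm_up_backtrace() {
    void* frames[1];
    backtrace(frames, 1);
}

// Hypothetical helper: writes the current call stack to fd and returns the
// number of frames captured. backtrace_symbols_fd() writes directly to the
// descriptor and does not call malloc.
int dump_backtrace(int fd) {
    void* frames[64];
    int count = backtrace(frames, 64);
    backtrace_symbols_fd(frames, count, fd);
    return count;
}
```

Inside a handler one would call dump_backtrace(STDERR_FILENO) and then terminate. Linking with -rdynamic usually makes the symbol names more useful.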

Notice that signal(7) says explicitly that only async-signal-safe functions can be called (directly or indirectly) from a signal handler. So in principle, calling malloc, ::operator new (which is called by most C++ containers!!) or printf from a signal handler is forbidden, and is not wise. However, if you just call a libbacktrace function and then _exit(2) from your signal handler, that would often (but in principle, not always) work.

If you want your application to report an error and stay active (e.g. if your application is a server that must continue serving a lot of requests), it would probably be very tricky (and sometimes impossible). For example, if the plugin is buggy to the point of having corrupted the heap, you would have to clean up the mess (which is not always possible). In some situations, I would imagine that the only thing to do is to restart the application (e.g. by calling execve(2) from inside the signal handler). Application checkpointing techniques could be relevant: you could design your application to checkpoint periodically and restart from the latest saved state.

In general, reliable crash recovery is really difficult, especially for C++ software. You need to understand a lot of implementation details. Using exclusively free software helps a lot: you can study the details inside all libraries (even libstdc++ and libc: you may need to understand the internals of the malloc implementation).

I am not even sure it is the right approach for plugins. You could perhaps consider helping the plugin developers, e.g. by spelling out some well-defined, application-specific coding rules (or programming style), and perhaps by developing some GCC compiler extensions, e.g. with MELT, to check some of those rules at plugin compilation time.

Other suggestions

Yes, it is possible. If a segfault occurs, your program first receives SIGSEGV (see signal(2) or, since signal is obsolete, sigaction(2)). Installing a handler for this signal lets you produce your crash report.

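#include <csignal>
#include <unistd.h>

// Handler: uses only async-signal-safe calls (write, _exit),
// since cout and exit are not safe inside a signal handler.
void crash(int sig) {
  const char msg[] = "report crash\n";
  ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
  (void)ignored;
  _exit(sig);
}

int main() {
  // connect signal to handler
  struct sigaction sa = {};
  sa.sa_handler = crash;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGSEGV, &sa, nullptr);

  return 0;
}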

As Jonathan Leffler mentioned in his comment, this is just a small suggestion of what to do. There are several signals that should be caught, not only SIGSEGV but maybe also SIGILL, SIGFPE, and others, depending on your application.
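Covering several signals with one handler might look like the sketch below. The list kCrashSignals and the names crash_reporter and install_crash_handlers are hypothetical. SA_RESETHAND is used so that the default action is restored when the handler starts; re-raising then ends the process with the original signal.

```cpp
#include <csignal>
#include <cstddef>
#include <unistd.h>

// Hypothetical list; extend or trim it to fit the application.
static const int kCrashSignals[] = {SIGSEGV, SIGBUS, SIGILL, SIGFPE};

extern "C" void crash_reporter(int sig) {
    const char msg[] = "fatal signal caught, terminating\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;
    // SA_RESETHAND has already restored the default action, so the
    // re-raised signal terminates the process once the handler returns.
    raise(sig);
}

// Returns how many handlers were installed.
std::size_t install_crash_handlers() {
    struct sigaction sa = {};
    sa.sa_handler = crash_reporter;
    sa.sa_flags = SA_RESETHAND;
    sigemptyset(&sa.sa_mask);
    std::size_t installed = 0;
    for (int sig : kCrashSignals) {
        if (sigaction(sig, &sa, nullptr) == 0) ++installed;
    }
    return installed;
}
```

Because the process still dies with the original signal, a parent process or shell sees the real cause of death, and a core dump is produced if core dumps are enabled.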

Licensed under: CC-BY-SA with attribution