Question

In my application I have setup signal handler to catch Segfaults, and print bactraces. My application loads some plugins libraries, when process starts.

If my application crashes with a segfault, due to an error in the main executable binary, I can analyze the backtrace with:

addr2line -Cif -e ./myapplication 0x4...

It accurately displays the function and the source_file:line_no

However how do analyze if the crash occurs due to an error in the plugin as in the backtrace below?

/opt/myapplication(_Z7sigsegvv+0x15)[0x504245]
/lib64/libpthread.so.0[0x3f1c40f500]
/opt/myapplication/modules/myplugin.so(_ZN11ICAPSection7processEP12CONNECTION_TP7Filebufi+0x6af)[0x7f5588fe4bbf]
/opt/myapplication/modules/myplugin.so(_Z11myplugin_reqmodP12CONNECTION_TP7Filebuf+0x68)[0x7f5588fe51e8]
/opt/myapplication(_ZN10Processors7ExecuteEiP12CONNECTION_TP7Filebuf+0x5b)[0x4e584b]
/opt/myapplication(_Z15process_requestP12CONNECTION_TP7Filebuf+0x462)[0x4efa92]
/opt/myapplication(_Z14handle_requestP12CONNECTION_T+0x1c6d)[0x4d4ded]
/opt/myapplication(_Z13process_entryP12CONNECTION_T+0x240)[0x4d79c0]
/lib64/libpthread.so.0[0x3f1c407851]
/lib64/libc.so.6(clone+0x6d)[0x3f1bce890d]

Both my application and plugin libraries have been compiled with gcc and are unstripped. My application when executed, loads the plugin.so with dlopen Unfortunately, the crash is occurring at a site where I cannot run the application under gdb.

Googled around frantically for an answer but all sites discussing backtrace and addr2line exclude scenarios where analysis of faulty plugins may be required. I hope some kind-hearted hack knows solution to this dilemma, and can share some insights. It would be so invaluable for fellow programmers.

Tons of thanks in advance.

Was it helpful?

Solution

Here are some hints that may help you debug this:

The address in your backtrace is an address in the address space of the process at the time it crashed. That means that, if you want to translate it into a 'physical' address relative to the start of the .text section of your library, you have to subtract the start address of the relevant section of pmap from the address in your backtrace.

Unfortunately, this means that you need a pmap of the process before it crashed. I admittedly have no idea whether loading addresses for libraries on a single system are constant if you close and rerun it (imaginably there are security features which randomize this), but it certainly isn't portable across systems, as you have noticed.

In your position, I would try:

  • demangling the symbol names with c++filt -n or manually. I don't have a shell right now, so here is my manual attempt: _ZN11ICAPSection7processEP12CONNECTION_TP7Filebufi is ICAPSection::process(CONNECTION_T *, Filebuf *, int). This may already be helpful. If not:
  • use objdump or nm (I'm pretty sure they can do that) to find the address corresponding to the mangled name, then add the offset (+0x6af as per your stacktrace) to this, then look up the resulting address with addr2line.

OTHER TIPS

us2012's answer was quite the trick required to solve the problem. I am just trying to restate it here just to help any other newbie struggling with the same problem, or if somebody wishes to offer improvements.

In the backtrace it is clearly visible that the flaw exists in the code for myplugin.so. And the backtrace indicates that it exists at:

/opt/myapplication/modules/myplugin.so(_ZN11ICAPSection7processEP12CONNECTION_TP7Filebufi+0x6af)[0x7f5588fe4bbf]

The problem of locating the line corresponding to this fault cannot be determined as simplistically as:

addr2line -Cif -e /opt/myapplication/modules/myplugin.so 0x7f5588fe4bbf

The correct procedure here would be to use nm or objdump to determine the address pointing to the mangled name. (Demangling as done by us2012 is not really necessary at this point). So using:

nm -Dlan /opt/myapplication/modules/myplugin.so | grep "_ZN11ICAPSection7processEP12CONNECTION_TP7Filebufi"

I get:

0000000000008510 T _ZN11ICAPSection7processEP12CONNECTION_TP7Filebufi   /usr/local/src/unstable/myapplication/sources/modules/myplugin/myplugin.cpp:518

Interesting to note here is that myplugin.cpp:518 actually points to the line where the opening "{" of the function ICAPSection::process(CONNECTION_T *, Filebuf *, int)

Next we add 0x6af to the address (revealed by the nm output above) 0000000000008510 using linux shell command

 printf '0x%x\n' $(( 0x0000000000008510 + 0x6af ))

And that results in 0x8bbf

And this is the actual source_file:line_no of the faulty code, and can be precisely determined with addr2line as:

addr2line -Cif -e /opt/myapplication/modules/myplugin.so 0x8bbf

Which displays:

std::char_traits<char>::length(char const*)
/usr/include/c++/4.4/bits/char_traits.h:263
std::string::assign(char const*)
/usr/include/c++/4.4/bits/basic_string.h:970
std::string::operator=(char const*)
/usr/include/c++/4.4/bits/basic_string.h:514
??
/usr/local/src/unstable/myapplication/sources/modules/myplugin/myplugin.cpp:622

I am not too sure why the function name was not displayed here, but myplugin.cpp:622 was quite precisely where the fault was.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top