I have a simple C program which behaves differently depending on whether or not it is run under gdb.
The program is this:
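#include <stdio.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>   /* getpid() */
int main(void) {
    /* Send SIGFPE to ourselves; the default action should kill the process. */
    kill(getpid(), SIGFPE);
    printf("I'm happy.\n");
    return 0;
}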
When I run it by itself, I get this very strange result:
ezyang@javelin:~$ ./mini
I'm happy.
ezyang@javelin:~$ echo $?
0
No error! That is not to say that the signal is not being sent. It is:
ezyang@javelin:~$ strace -e signal ./mini
kill(31950, SIGFPE) = 0
--- SIGFPE (Floating point exception) @ 0 (0) ---
I'm happy
When in GDB, things proceed differently:
ezyang@javelin:~/Dev/ghc-build-sandbox/libraries/unix/tests/libposix$ gdb ./mini
GNU gdb (GDB) 7.5.91.20130417-cvs-ubuntu
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /srv/code/ghc-build-sandbox/libraries/unix/tests/libposix/mini...(no debugging symbols found)...done.
(gdb) r
Starting program: /srv/code/ghc-build-sandbox/libraries/unix/tests/libposix/mini
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
Program received signal SIGFPE, Arithmetic exception.
0x00007ffff7a49317 in kill () at ../sysdeps/unix/syscall-template.S:81
81 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) c
Continuing.
Program terminated with signal SIGFPE, Arithmetic exception.
The program no longer exists.
Asking GDB not to stop on the signal makes no difference:
(gdb) handle SIGFPE nostop
Signal Stop Print Pass to program Description
SIGFPE No Yes Yes Arithmetic exception
(gdb) r
Starting program: /srv/code/ghc-build-sandbox/libraries/unix/tests/libposix/mini
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
Program received signal SIGFPE, Arithmetic exception.
Program terminated with signal SIGFPE, Arithmetic exception.
The program no longer exists.
What's going on? First, why isn't the SIGFPE killing the program? Second, why is GDB behaving differently?
Update. One thought is that the child process is inheriting the signal masks of the parent. I initially took the following transcript to show that this is not the case, but that analysis was not correct; see below.
ezyang@javelin:~$ trap - SIGFPE
ezyang@javelin:~$ ./mini
I'm happy.
Update 2. A friend of mine points out that trap only reports signal settings made by the shell itself, and not those made by any parent processes. So we tracked down the ignore masks of all the parents, and lo and behold, rxvt-unicode had SIGFPE masked. A friend confirmed he could reproduce the behavior when he ran the executable from rxvt-unicode.