Question

I am seeing unusual signal numbers (for example 50, 80 or 117) from the following code when waiting for a child process to terminate. This happens with only one particular child process, whose source code I cannot access, and only some of the time.

What do these unusual values mean, given that NSIG == 32, and where can I find documentation for them in the headers or man pages?

Note that this code runs in a loop sending progressively more menacing signals until the child terminates.

int status, signal;

if (waitpid(m_procId, &status, WNOHANG) < 0) {
    LOGERR << "Failed to wait for process " << name() << ": " <<
        strerror(errno) << " (" << errno << ")";
    break;
} else if (WIFEXITED(status)) {
    m_exitCode = WEXITSTATUS(status);
    terminated = true;
    LOGINF << "Process " << name() << " terminated with exit code " << m_exitCode;
} else if (WIFSIGNALED(status)) {
    signal = WTERMSIG(status);    // !!! signal is sometimes 50, 80 or 117 !!!
    terminated = true;
    LOGINF << "Process " << name() << " terminated by signal " << signal;
} else {
    LOGWRN << "Process " << name() << " changed state but did not terminate.  status=0x" <<
        hex << status;
}

This is running under OS X 10.8.4, but I have also seen it in the 10.9 GM seed.

EDIT: Modifying the code as below makes it more robust. However, the child process is sometimes orphaned, presumably because the loop doesn't do enough to kill it.

else if (WIFSIGNALED(status)) {
    signal = WTERMSIG(status);
    if (signal < NSIG) {
        terminated = true;
        LOGINF << "Process " << name() << " terminated by signal " << signal;
    } else {
        LOGWRN << "Process " << name() << " produced unusual signal " << signal
               << "; assuming it's not terminated";
    }
}

Note this code is part of the Process::unload() method of this class.

Solution

From the OS X man page for waitpid: when specifying WNOHANG, you should check for a return value of 0:

 When the WNOHANG option is specified and no processes wish to report status, wait4() returns a process
 id of 0.

 The waitpid() call is identical to wait4() with an rusage value of zero.  The older wait3() call is the
 same as wait4() with a pid value of -1.

The posted code does not check for this, so whenever waitpid returns 0 the value of status is likely junk (the int is never initialized). That could cause what you are seeing.

EDIT: status is indeed only set when waitpid returns > 0.
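
As a rough illustration of that check, here is a minimal sketch that decodes status only when waitpid returns the child's pid and treats a return of 0 as "still running". The names ChildState, PollResult, pollChild and terminateChild are hypothetical and not part of the original class; the SIGTERM-then-SIGKILL sequence and the polling intervals are assumptions standing in for the question's escalation loop.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

// Hypothetical result type for this sketch.
enum class ChildState { StillRunning, Exited, Signaled, Other, Error };

struct PollResult {
    ChildState state;
    int value;  // exit code, signal number, raw status, or errno
};

// Polls one child without blocking. The W* macros are applied only
// when waitpid() returned the child's pid, i.e. when status was written.
PollResult pollChild(pid_t pid) {
    assert(pid > 0);  // this sketch handles one specific child only
    int status = 0;   // initialized so it never holds stack garbage
    pid_t rc = waitpid(pid, &status, WNOHANG);

    if (rc < 0)  return {ChildState::Error, errno};
    if (rc == 0) return {ChildState::StillRunning, 0};  // nothing to report yet
    if (WIFEXITED(status))   return {ChildState::Exited, WEXITSTATUS(status)};
    if (WIFSIGNALED(status)) return {ChildState::Signaled, WTERMSIG(status)};
    return {ChildState::Other, status};
}

// Assumed escalation: SIGTERM first, then SIGKILL, polling between sends.
// A StillRunning result means the child outlived every poll.
PollResult terminateChild(pid_t pid, unsigned pollIntervalUs = 10000,
                          int pollsPerSignal = 50) {
    const int signals[] = {SIGTERM, SIGKILL};
    PollResult r = pollChild(pid);
    for (int sig : signals) {
        if (r.state != ChildState::StillRunning) break;
        kill(pid, sig);  // return value ignored in this sketch
        for (int i = 0; i < pollsPerSignal; ++i) {
            usleep(pollIntervalUs);
            r = pollChild(pid);
            if (r.state != ChildState::StillRunning) break;
        }
    }
    return r;
}
```

With this shape, the "unusual signal" branch from the question's EDIT should no longer be needed, since the loop keeps escalating while the child is still running instead of interpreting a status that was never written.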

Licensed under: CC-BY-SA with attribution