I will answer my own question.
There are several reasons not to count child processes this way. First, a signal handler can be interrupted by another signal. I could not find a clear description of what actually happens in that situation; there is some information in the libc manual and in this answer. But that may not be the issue here.
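One way to limit handler interruption is the sa_mask field of sigaction(): signals listed there stay blocked while the handler runs, and the signal being handled is blocked as well unless SA_NODEFER is set. The sketch below is only an illustration; install_with_mask and noop_handler are hypothetical names, not part of the original program.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical do-nothing handler, used only for illustration. */
static void noop_handler(int signo)
{
    (void)signo;
}

/* Install noop_handler for `signo` and keep `also_block` blocked
 * for as long as the handler is running.
 * Returns 0 on success, -1 on error. */
int install_with_mask(int signo, int also_block)
{
    struct sigaction sa;

    sa.sa_handler = noop_handler;
    sa.sa_flags = 0;
    if (sigemptyset(&sa.sa_mask) == -1 ||
        sigaddset(&sa.sa_mask, also_block) == -1)
        return -1;

    return sigaction(signo, &sa, NULL);
}
```

For example, install_with_mask(SIGUSR1, SIGUSR2) would keep SIGUSR2 pending until the SIGUSR1 handler returns.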
It turns out that not every operation on a volatile sig_atomic_t variable is atomic: only plain reads and writes are guaranteed to be, and what a decrement compiles to depends on the architecture. For example, on amd64 the compiled code for decrementing sproc_counter looks like this:
movl sproc_counter(%rip), %eax
subl $1, %eax
movl %eax, sproc_counter(%rip)
As you can see, the decrement takes three assembly instructions: a load, a subtract and a store. That is definitely not atomic, so access to sproc_counter has to be synchronized.
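For comparison, here is a minimal sketch of an atomic read-modify-write using C11 <stdatomic.h>. This is not what the code in this answer uses; it assumes a C11 compiler, and child_count and the three helper functions are hypothetical names. Whether atomic_int is lock-free (and therefore usable from a signal handler) depends on the platform and can be checked with ATOMIC_INT_LOCK_FREE.

```c
#include <stdatomic.h>

/* Hypothetical counter standing in for sproc_counter.
 * Objects with static storage duration start at zero. */
static atomic_int child_count;

/* Atomically add one; returns the new value. */
int child_count_inc(void)
{
    return atomic_fetch_add(&child_count, 1) + 1;
}

/* Atomically subtract one; returns the new value. */
int child_count_dec(void)
{
    return atomic_fetch_sub(&child_count, 1) - 1;
}

/* Atomically read the current value. */
int child_count_get(void)
{
    return atomic_load(&child_count);
}
```

With this approach the load, subtract and store happen as one indivisible operation, so no other thread can see or overwrite an intermediate value.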
Okay, but why did adding a mutex not help? The answer is in the manual page for pthread_mutex_lock()/pthread_mutex_unlock():
ASYNC-SIGNAL SAFETY
The mutex functions are not async-signal safe. What this means is that
they should not be called from a signal handler. In particular, calling
pthread_mutex_lock or pthread_mutex_unlock from a signal handler may
deadlock the calling thread.
That makes it clear. What is more, calling functions that print the date (a log message) from the handler is also a bad idea: the fputs() used there is not async-signal-safe.
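If a message really has to be emitted from a handler, write() is on the POSIX list of async-signal-safe functions. The following is a rough sketch under that assumption; safe_log is a hypothetical helper, not something from the original program, and it can only print fixed strings (no formatting).

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write a NUL-terminated message to `fd`
 * using only write(2). Returns the number of bytes written,
 * or -1 on error. errno is restored so that the interrupted
 * code does not see it change. */
ssize_t safe_log(int fd, const char *msg)
{
    int saved_errno = errno;
    size_t len = 0;
    ssize_t n;

    while (msg[len] != '\0')   /* count bytes without calling into libc */
        ++len;

    n = write(fd, msg, len);
    errno = saved_errno;
    return n;
}
```

A handler could then call safe_log(STDERR_FILENO, "child exited\n") instead of going through stdio.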
How to do it correctly?
Bearing in mind what can happen during signal handling (e.g. delivery of other signals), it is quite clear that a signal handling routine should be as terse as possible. It is much better to set a flag in the handler and test it from time to time in the main program or in a dedicated thread. I chose the second solution.
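The flag pattern described above can be sketched in isolation as follows. This is a simplified, single-threaded illustration, not the program's actual code: got_sigchld, on_sigchld and flag_demo are hypothetical names, and the 1 ms polling interval and the retry bound are arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigchld = 0;  /* set by the handler only */

static void on_sigchld(int signo)
{
    (void)signo;
    got_sigchld = 1;           /* the only work done in the handler */
}

/* Fork one short-lived child, poll the flag, then reap the child.
 * Returns the number of children reaped, or -1 on error. */
int flag_demo(void)
{
    struct sigaction sa;
    struct timespec delay = { 0, 1000000 };    /* 1 ms */
    pid_t pid;
    int i, reaped = 0;

    sa.sa_handler = on_sigchld;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(0);              /* child terminates immediately */

    /* Poll the flag outside the handler, giving up after about 5 s. */
    for (i = 0; i < 5000 && !got_sigchld; ++i)
        nanosleep(&delay, NULL);

    /* Clear the flag before reaping so a SIGCHLD that arrives
     * during the loop below is not lost. */
    got_sigchld = 0;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ++reaped;

    return reaped;
}
```

All the real work (waitpid() and any bookkeeping) happens in ordinary program context, where mutexes and stdio are safe to use.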
No more words, let's see some code.
The signal handler looks like this:
void sig_chld (int signo __attribute__((__unused__)))
{
    sigchld_notify = 1;
}
The main() routine:
volatile sig_atomic_t sigchld_notify = 0; /* SIGCHLD notifier */
int sproc_counter = 0; /* forked child process counter */
pthread_mutex_t sproc_mutex = PTHREAD_MUTEX_INITIALIZER; /* mutex for child process counter */
/* S/MIME Gate main function */
int main (int argc, char **argv)
{
    pthread_t guard_id;
    [...]
    /* start child process guard */
    if (0 != pthread_create(&guard_id, NULL, child_process_guard, NULL) )
        err_sys("pthread_create error");
    [...]
    /* SMTP Server's main loop */
    for (;;) {
        int below_limit;
        [...]
        /* check whether child processes limit is not exceeded;
           the guard thread modifies the counter, so read it under the mutex */
        pthread_mutex_lock(&sproc_mutex);
        below_limit = (sproc_counter < MAXSUBPROC);
        pthread_mutex_unlock(&sproc_mutex);
        if (below_limit) {
            if ( (childpid = Fork()) == 0) { /* child process */
                Close(listenfd); /* close listening socket */
                smime_gate_service(connfd); /* process the request */
                exit(0);
            }
            pthread_mutex_lock(&sproc_mutex);
            ++sproc_counter;
            pthread_mutex_unlock(&sproc_mutex);
        }
        else
            err_msg("subprocesses limit exceeded, connection refused");
        Close(connfd); /* parent closes connected socket */
    }
} /* end of main() */
Guarding thread routine:
extern volatile sig_atomic_t sigchld_notify; /* SIGCHLD notifier */
extern int sproc_counter; /* forked child process counter */
extern pthread_mutex_t sproc_mutex; /* mutex for child process counter */
void* child_process_guard (void* arg __attribute__((__unused__)))
{
    pid_t pid;
    int stat;
    for (;;) {
        if (0 == sigchld_notify) {
            usleep(SIGCHLD_SLEEP);
            continue;
        }
        /* clear the flag before reaping, so that a SIGCHLD delivered
           while the loop below is running is not lost */
        sigchld_notify = 0;
        while ( (pid = waitpid(-1, &stat, WNOHANG)) > 0) {
            pthread_mutex_lock(&sproc_mutex);
            --sproc_counter;
            pthread_mutex_unlock(&sproc_mutex);
            err_msg("child %d terminated", (int) pid);
        }
    }
    return NULL;
}