Your problem is the pair of wait() loops:
    while (wait(&status) != child1);
    while (wait(&status) != child2);
In your scenario, the second child dies before the first does, so you collect its corpse in the first loop but ignore it. The second loop then goes into a busy wait: there are no children left any more, so wait() returns -1 immediately on every call, and -1 never equals child2.
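To see why the second loop spins rather than blocks, here is a minimal sketch of what wait() does once there is nothing left to collect. The function name wait_reports_no_children() is made up for illustration, and the check is only meaningful in a process that currently has no children (otherwise wait() would block or reap one).

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: returns 1 if wait() fails at once with ECHILD,
 * which is what happens when the caller has no children to wait for. */
static int wait_reports_no_children(void)
{
    int status;

    errno = 0;
    pid_t corpse = wait(&status);   /* does not block: nothing to wait for */
    return corpse == -1 && errno == ECHILD;
}
```

A loop such as `while (wait(&status) != child2);` never terminates in that state, because every iteration gets -1 back straight away.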
At minimum, you need to do:
    int corpse;
    while ((corpse = wait(&status)) != -1 && corpse != child1 && corpse != child2)
        ;
    while ((corpse = wait(&status)) != -1 && corpse != child1 && corpse != child2)
        ;
This handles children dying in either order — but only the two children. For a more general pipeline (three or more processes), you have to work harder — and use a single loop. The more general form will be something like:
    int corpse;
    while ((corpse = wait(&status)) != -1)
    {
        if (record_death_of_child(corpse, status) == -1)
            break;
    }
where your process creation code records the PIDs of the created processes, and the record_death_of_child() code deals with that list of PIDs and returns -1 when there are no more children to wait for in the current pipeline (and 0 otherwise). Or you can have it use some other heuristic to determine when to exit the loop. Note that if you have long-running jobs in the background, any of them could die, and that corpse would be collected in the loop. The 'record death' function would need to deal with such processes too: they can no longer be brought into the foreground, for example, and you need to report that they exited, etc.
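One way the bookkeeping might look is sketched below. Everything except record_death_of_child() and its return convention is an assumption made for illustration: the fixed-size array, the MAX_PIPELINE limit, and the names record_birth_of_child(), wait_for_pipeline(), pipeline_pids and pipeline_live are all hypothetical, and a real shell would also need a proper job table for the background processes.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { MAX_PIPELINE = 16 };                 /* hypothetical limit on pipeline length */

static pid_t pipeline_pids[MAX_PIPELINE];   /* PIDs of the current pipeline */
static int   pipeline_live = 0;             /* how many of them are still alive */

/* Hypothetical: called by the process creation code after each fork(). */
static int record_birth_of_child(pid_t pid)
{
    if (pipeline_live >= MAX_PIPELINE)
        return -1;
    pipeline_pids[pipeline_live++] = pid;
    return 0;
}

/* Returns -1 when no children remain in the current pipeline, 0 otherwise. */
static int record_death_of_child(pid_t corpse, int status)
{
    for (int i = 0; i < pipeline_live; i++)
    {
        if (pipeline_pids[i] == corpse)
        {
            /* Remove the entry by overwriting it with the last live one. */
            pipeline_pids[i] = pipeline_pids[--pipeline_live];
            return (pipeline_live == 0) ? -1 : 0;
        }
    }
    /* Not in the list: assumed to be a background job, so report it. */
    fprintf(stderr, "background process %d exited (status 0x%.4X)\n",
            (int)corpse, (unsigned)status);
    return (pipeline_live == 0) ? -1 : 0;
}

/* The single wait loop, guarded so it never blocks on an empty pipeline. */
static void wait_for_pipeline(void)
{
    int status;
    pid_t corpse;

    if (pipeline_live == 0)
        return;
    while ((corpse = wait(&status)) != -1)
    {
        if (record_death_of_child(corpse, status) == -1)
            break;
    }
}
```

The process creation code would call record_birth_of_child() in the parent after each successful fork(), then call wait_for_pipeline() once the whole pipeline has been launched. The order in which the children die does not matter, because each corpse is looked up in the list rather than compared against one expected PID.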
You might end up using waitpid() too, since with the WNOHANG option you can arrange for it not to hang while there's a background process that's still running.
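As a rough sketch of the non-blocking variant, the function below collects every child that has already died and returns at once if the rest are still running. The name reap_finished_children() and the printed messages are made up for illustration; waitpid(), WNOHANG and the status macros are the standard POSIX ones.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: reap every child that has already exited, without
 * blocking. Returns the number of corpses collected. */
static int reap_finished_children(void)
{
    int status;
    int count = 0;
    pid_t corpse;

    /* A PID of -1 means "any child". With WNOHANG, waitpid() returns 0
     * when children exist but none has died yet, and -1 (errno ECHILD)
     * when there are no children at all; both cases end the loop. */
    while ((corpse = waitpid(-1, &status, WNOHANG)) > 0)
    {
        if (WIFEXITED(status))
            printf("%d exited with status %d\n", (int)corpse, WEXITSTATUS(status));
        else if (WIFSIGNALED(status))
            printf("%d killed by signal %d\n", (int)corpse, WTERMSIG(status));
        count++;
    }
    return count;
}
```

A shell could call something like this just before printing each prompt, so that finished background jobs get reported without the shell ever stalling on ones that are still running.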