Question

Greg's Wiki has this very simple example of how to keep a server running so that it is restarted immediately whenever it exits:

#!/bin/sh
while :; do
   /my/game/server -foo -bar -baz >> /var/log/mygameserver 2>&1
done

But what if you want to keep N servers running so that if one fails, all of them are restarted? According to http://wiki.bash-hackers.org/scripting/bashchanges, bash 4.3 will let me do:

while :; do
    server1 & p1=$!
    server2 & p2=$!
    wait -n $p1 $p2 # wait until at least one exits
    kill $p1 $p2
done

but 4.3 is still in alpha. Is there a way to do this on older systems?

No correct solution

OTHER TIPS

Here's the method I came up with, based on Greg's Wiki and some help from #bash on irc.freenode.net:

#!/bin/bash
trap 'rm -f manager; kill 0' EXIT
mkfifo manager
declare -A pids
restart () {
    # assuming your servers/daemons are programs "a" and "b"
    # whichever server just exited is already gone, so ignore kill errors
    [[ -n ${pids[a]} ]] && kill "${pids[a]}" 2>/dev/null
    [[ -n ${pids[b]} ]] && kill "${pids[b]}" 2>/dev/null
    ./run_and_tell manager a & pids[a]=$!
    ./run_and_tell manager b & pids[b]=$!
}
restart
while :; do
  read < manager
  restart
done

and run_and_tell:

#!/bin/bash
trap 'kill $pid' EXIT
manager=$1
prog=$2
$prog & pid=$!
wait $pid
echo >"$manager"

It's not as nice as the bash 4.3 version, but it seems to work (e.g. when tested with "sleep 9999" as the program in run_and_tell). One annoyance is that I have to set trap 'kill $pid' EXIT in the runner, and it seems I have to do the same in $prog, to ensure that each child is killed when its parent is killed.
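To see the notification mechanism on its own, here is a minimal stand-alone sketch. It is not the scripts above: the function run_and_report, the labels "fast" and "slow", and the temporary directory are all made up for the demo, and two sleep commands stand in for real servers. Each wrapper writes its label to the FIFO when its command exits, and the main script blocks on read until that happens.

```shell
#!/bin/bash
# Stand-alone demo; "fast" and "slow" are made-up stand-ins for real servers.
dir=$(mktemp -d)
fifo=$dir/manager
mkfifo "$fifo"

# Run a command in the background, then report its label through the FIFO.
run_and_report () {
    local label=$1 child
    shift
    "$@" & child=$!
    # If this wrapper is killed, take the wrapped command down with it.
    trap 'kill "$child" 2>/dev/null; exit 143' TERM
    wait "$child"
    echo "$label" > "$fifo"
}

run_and_report fast sleep 0.2 &
run_and_report slow sleep 5 & slow_pid=$!

# Opening the FIFO for reading blocks until a wrapper opens it for writing.
read -r first_exited < "$fifo"
echo "first to exit: $first_exited"

kill "$slow_pid" 2>/dev/null   # stop the remaining wrapper (and its sleep)
wait
rm -rf "$dir"
```

The TERM trap plays the same role as the EXIT trap in run_and_tell: without it, killing the wrapper would leave its sleep running in the background.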

Here's an alternative version that avoids having to trap, by putting run_and_tell in its own process group:

#!/bin/bash
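# The trap now needs to kill all created process groups. They are killed
# before "kill 0", because "kill 0" also signals this script itself:
trap 'kill -TERM -"${pids[a]}" -"${pids[b]}" 2>/dev/null; rm -f manager; kill 0' EXIT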
mkfifo manager
declare -A pids
restart () {
    # assuming servers/daemons are programs "a" and "b":
    [[ -n ${pids[a]} ]] && kill -TERM -"${pids[a]}"
    [[ -n ${pids[b]} ]] && kill -TERM -"${pids[b]}"
    setsid ./run_and_tell manager a & pids[a]=$!
    setsid ./run_and_tell manager b & pids[b]=$!
}
restart
while :; do
  read < manager
  restart
done

and run_and_tell becomes just:

#!/bin/bash
manager=$1
prog=$2
$prog
echo >"$manager"
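The reason no trap is needed is that a signal sent to a negative PID reaches every process in that group. Here is a small sketch of just that effect, assuming the util-linux setsid command (so Linux); the marker file and the "wrapper" name are made up for the demo. The wrapper starts a background child that would create the marker file after one second, so the file can only appear if the child outlives the kill.

```shell
#!/bin/bash
# Assumes the util-linux setsid(1) command (Linux). All names are made up.
dir=$(mktemp -d)
marker=$dir/survived

# The wrapper starts a background child that would create $marker after
# one second, so the marker only appears if the child outlives the kill.
setsid sh -c '(sleep 1; touch "$1") & wait' wrapper "$marker" & pgid=$!

sleep 0.3                      # give setsid time to create the new group
kill -s TERM -- -"$pgid"       # negative PID: signal every process in the group
wait "$pgid" 2>/dev/null       # reap the wrapper
sleep 1.5                      # longer than the child's one-second delay

if [ -e "$marker" ]; then
    result="child survived"
else
    result="child was killed with its group"
fi
echo "$result"
rm -rf "$dir"
```

If the kill line targeted "$pgid" instead of -"$pgid", only the wrapper would receive the signal and the marker file would be created.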

The simplest way is to poll the processes at a regular interval:

#!/bin/bash

function check_if_all_active {
    local p
    for p in "$@"; do
        kill -s 0 "$p" &>/dev/null || return 1
    done
    return 0
}

while :; do
    pids=()
    server1 & pids+=("$!")
    server2 & pids+=("$!")
    while check_if_all_active "${pids[@]}"; do
        sleep 1s  ## Can be longer.
    done
    kill -s SIGTERM "${pids[@]}" &>/dev/null
done

You can also stop your processes with a different signal, such as SIGHUP or SIGABRT.
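Whichever signal you pick, a server may ignore it. One possible approach, sketched below, is to ask politely first and fall back to SIGKILL after a timeout. The helper name stop_gracefully and its timeout argument are my own invention for illustration, and the "server" in the demo is just a sleep that ignores SIGTERM.

```shell
#!/bin/bash
# stop_gracefully is a hypothetical helper, not part of the scripts above.
# Usage: stop_gracefully PID [TIMEOUT_SECONDS]
stop_gracefully () {
    local pid=$1 timeout=${2:-5} waited=0
    kill -s TERM "$pid" 2>/dev/null || return 0    # nothing to stop
    while kill -s 0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            kill -s KILL "$pid" 2>/dev/null        # polite request was ignored
            break
        fi
        sleep 1
        waited=$((waited + 1))
    done
    wait "$pid" 2>/dev/null                        # reap it if it is our child
    return 0
}

# Demo: a stand-in "server" that ignores SIGTERM.
sh -c 'trap "" TERM; exec sleep 30' & pid=$!
sleep 0.3                                          # let it install the trap
stop_gracefully "$pid" 1

if kill -s 0 "$pid" 2>/dev/null; then state=running; else state=stopped; fi
echo "server is $state"
```

In the polling loop above, this would replace the single kill line, called once per PID.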

Licensed under: CC-BY-SA with attribution