What is the best way to ensure only one instance of a Bash script is running? [duplicate]

https://stackoverflow.com/questions/1715137

19-09-2019
Question

This question already has an answer here:

Quick-and-dirty way to ensure only one instance of a shell script is running at a time 39 answers

What is the simplest/best way to ensure only one instance of a given script is running - assuming it's Bash on Linux?

At the moment I'm doing:

ps -C script.name.sh > /dev/null 2>&1 || ./script.name.sh

but it has several issues:

It puts the check outside of the script.
It doesn't let me run the same script from separate accounts, which I would sometimes like to do.
-C checks only the first 14 characters of the process name.

Of course, I can write my own pidfile handling, but I sense that there should be a simple way to do it.

Solution

If the script is the same across all users, you can use a lockfile approach. If you acquire the lock, proceed; otherwise, show a message and exit.

As an example:

[Terminal #1] $ lockfile -r 0 /tmp/the.lock
[Terminal #1] $ 

[Terminal #2] $ lockfile -r 0 /tmp/the.lock
[Terminal #2] lockfile: Sorry, giving up on "/tmp/the.lock"

[Terminal #1] $ rm -f /tmp/the.lock
[Terminal #1] $ 

[Terminal #2] $ lockfile -r 0 /tmp/the.lock
[Terminal #2] $

After /tmp/the.lock has been acquired, your script will be the only instance allowed to run. When you are done, just remove the lock. In script form this might look like:

#!/bin/bash

lockfile -r 0 /tmp/the.lock || exit 1

# Do stuff here

rm -f /tmp/the.lock
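
One weakness of the script form above is that /tmp/the.lock stays behind if the script dies before reaching the final rm. A minimal sketch of one way to guard against that, assuming the same lockfile tool from procmail is installed; acquire_lock is a hypothetical helper name used only for illustration:

```shell
#!/bin/bash

lock_path=/tmp/the.lock

# Hypothetical helper: take the lock, then register its removal.
acquire_lock() {
    # -r 0: no retries, fail at once if the lock already exists
    lockfile -r 0 "$lock_path" || return 1
    # Register the cleanup only after the lock is ours, so a failed
    # attempt can never delete a lock held by another instance.
    trap 'rm -f "$lock_path"' EXIT
}

# Usage, at the top of the script:
#   acquire_lock || exit 1
#   ... do stuff here ...
```

With the trap in place the explicit rm -f at the end of the script is no longer needed, because the lock is removed whenever the shell exits normally or through exit.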

OTHER TIPS

Advisory locking has been used for ages and it can be used in bash scripts. I prefer simple flock (from util-linux[-ng]) over lockfile (from procmail). And always remember to set a trap on exit in those scripts (sigspec == EXIT or 0; trapping specific signals is superfluous).

In 2009 I released my lockable script boilerplate (originally available at my wiki page, nowadays available as gist). Transforming that into one-instance-per-user is trivial. Using it you can also easily write scripts for other scenarios requiring some locking or synchronization.

Here is the mentioned boilerplate for your convenience.

#!/bin/bash
# SPDX-License-Identifier: MIT

## Copyright (C) 2009 Przemyslaw Pawelczyk <przemoc@gmail.com>
##
## This script is licensed under the terms of the MIT license.
## https://opensource.org/licenses/MIT
#
# Lockable script boilerplate

### HEADER ###

LOCKFILE="/var/lock/`basename $0`"
LOCKFD=99

# PRIVATE
_lock()             { flock -$1 $LOCKFD; }
_no_more_locking()  { _lock u; _lock xn && rm -f $LOCKFILE; }
_prepare_locking()  { eval "exec $LOCKFD>\"$LOCKFILE\""; trap _no_more_locking EXIT; }

# ON START
_prepare_locking

# PUBLIC
exlock_now()        { _lock xn; }  # obtain an exclusive lock immediately or fail
exlock()            { _lock x; }   # obtain an exclusive lock
shlock()            { _lock s; }   # obtain a shared lock
unlock()            { _lock u; }   # drop a lock

### BEGIN OF SCRIPT ###

# Simplest example is avoiding running multiple instances of script.
exlock_now || exit 1

# Remember! Lock file is removed when one of the scripts exits and it is
#           the only script holding the lock or lock is not acquired at all.
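
The one-instance-per-user transformation mentioned above comes down to giving each user a separate lock file. A rough sketch under stated assumptions: flock from util-linux is available, the lock lives in ${TMPDIR:-/tmp} (an illustrative choice, not the boilerplate's /var/lock), and lock_per_user is a hypothetical helper name that is not part of the boilerplate:

```shell
#!/bin/bash

# Hypothetical helper: lock_per_user NAME
# Takes an exclusive, non-blocking lock on a file that is unique to
# NAME and the numeric ID of the calling user.
lock_per_user() {
    lock_file="${TMPDIR:-/tmp}/$1.$(id -u).lock"
    # Keep the file open on descriptor 9 for the rest of the script;
    # the lock lasts as long as this descriptor stays open.
    exec 9>"$lock_file"
    flock -n 9
}

# Usage, near the top of a script:
#   lock_per_user "$(basename -- "$0")" || exit 1
```

Each user contends only for their own file, so two users can run the script at the same time while a second copy started by the same user is refused.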

I think flock is probably the easiest (and most memorable) variant. I use it in a cron job to auto-encode DVDs and CDs:

# try to run a command, but fail immediately if it's already running
flock -n /var/lock/myjob.lock   my_bash_command

Use -w for timeouts or leave out options to wait until the lock is released. Finally, the man page shows a nice example for multiple commands:

   (
     flock -n 9 || exit 1
     # ... commands executed under lock ...
   ) 9>/var/lock/mylockfile
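
The -w option mentioned above can be combined with the subshell form. A small sketch, assuming flock from util-linux; run_with_lock_timeout is a hypothetical helper name and the exit status 75 is an arbitrary value picked here to mean "timed out":

```shell
#!/bin/bash

# Hypothetical helper:
#   run_with_lock_timeout LOCK_PATH SECONDS COMMAND [ARGS...]
# Waits up to SECONDS for the lock, then runs COMMAND under it.
run_with_lock_timeout() {
    lock_path=$1
    timeout_s=$2
    shift 2
    (
        # -w N: give up if the lock is not obtained within N seconds.
        flock -w "$timeout_s" 9 || exit 75
        "$@"
    ) 9>"$lock_path"
}

# Usage:
#   run_with_lock_timeout /var/lock/mylockfile 10 my_bash_command
```

The helper returns the status of the command itself, so 75 signals a timeout only as long as the command never returns 75 on its own.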

Use the `set -o noclobber` option and attempt to overwrite a common file.

A short example

if ! (set -o noclobber ; echo > /tmp/global.lock) ; then
    exit 1  # the global.lock already exists
fi

# ... remainder of script ...
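
To see why this works, here is a small demonstration of the noclobber behaviour. It is only a sketch: try_lock is a hypothetical helper name, and the lock lives in a scratch directory rather than /tmp/global.lock.

```shell
#!/bin/bash

# Hypothetical helper: try_lock LOCK_PATH
# Succeeds only for the caller that actually creates the file.
try_lock() {
    # The subshell keeps noclobber from leaking into the rest of the
    # script. With noclobber set, ">" refuses to overwrite an existing
    # file, so the redirection fails if the lock is already there.
    ( set -o noclobber; echo "$$" > "$1" ) 2>/dev/null
}

demo_dir=$(mktemp -d)              # scratch directory for the demo
demo_lock="$demo_dir/global.lock"

try_lock "$demo_lock" && first="acquired"  || first="refused"
try_lock "$demo_lock" && second="acquired" || second="refused"
rm -f "$demo_lock"                 # release the lock
try_lock "$demo_lock" && third="acquired"  || third="refused"

echo "$first $second $third"       # prints: acquired refused acquired
rm -rf "$demo_dir"
```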

A longer example

This example will wait for the global.lock file but timeout after too long.

 function lockfile_waithold()
 {
    declare -ir time_beg=$(date '+%s')
    declare -ir time_max=7140  # 7140 s = 1 hour 59 min.

    # poll for lock file up to ${time_max}s
    # put debugging info in lock file in case of issues ...
    while ! \
       (set -o noclobber ; \
        echo -e "DATE:$(date)\nUSER:$(whoami)\nPID:$$" > /tmp/global.lock \
       ) 2>/dev/null
    do
        if [ $(($(date '+%s') - ${time_beg})) -gt ${time_max} ] ; then
            echo "Error: waited too long for lock file /tmp/global.lock" 1>&2
            return 1
        fi
        sleep 1
    done

    return 0
 }

 function lockfile_release()
 {
    rm -f /tmp/global.lock
 }

 if ! lockfile_waithold ; then
      exit 1
 fi
 trap lockfile_release EXIT

 # ... remainder of script ...

(This is similar to this post by @Barry Kelly which was noticed afterward.)

I'm not sure there's any one-line robust solution, so you might end up rolling your own.

Lockfiles are imperfect, but less so than using 'ps | grep | grep -v' pipelines.

Having said that, you might consider keeping the process control separate from your script by using a start script. Or at least factor it out into functions held in a separate file, so that the caller script might contain:

. my_script_control.ksh

# Function exits if cannot start due to lockfile or prior running instance.
my_start_me_up lockfile_name;
trap "rm -f $lockfile_name; exit" 0 2 3 15

in each script that needs the control logic. The trap ensures that the lockfile gets removed when the caller exits, so you don't have to code this on each exit point in the script.

Using a separate control script means that you can sanity check for edge cases: remove stale lockfiles, verify that the lockfile is associated correctly with a currently running instance of the script, give an option to kill the running process, and so on. It also means you've got a better chance of using grep on ps output successfully. A ps-grep can be used to verify that a lockfile has a running process associated with it. Perhaps you could name your lockfiles so that they include information about the process (user, pid, etc.), which a later script invocation can use to decide whether the process that created the lockfile is still around.
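
As an illustration of that last check, here is a sketch of a function a control script might use. It assumes the first line of the lockfile holds the PID of the process that created it; lock_owner_alive is a hypothetical name, not something from an existing package.

```shell
#!/bin/bash

# Hypothetical helper: lock_owner_alive LOCK_PATH
# Returns 0 if the PID recorded in the lockfile belongs to a process
# that still exists, 1 otherwise (including a missing or garbled file).
lock_owner_alive() {
    owner_pid=$(head -n 1 "$1" 2>/dev/null) || return 1
    # Reject empty or non-numeric content before handing it to kill.
    case $owner_pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Signal 0 delivers nothing; it only checks whether the PID can
    # be signalled by the current user.
    kill -0 "$owner_pid" 2>/dev/null
}

# Usage:
#   lock_owner_alive /tmp/myjob.lock || rm -f /tmp/myjob.lock
```

Two caveats apply: kill -0 also fails for a process owned by another user, and PIDs can be reused, so a positive answer does not prove that the same script is still the one running.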

First test example

[[ $(lsof -t "$0" | wc -l) -gt 1 ]] && echo "At least one of $0 is running"

Second test example

currsh=$0
currpid=$$
# PIDs of every process that has the script file open, joined by spaces
runpid=$(lsof -t "$currsh" | paste -s -d " ")
if [[ $runpid == "$currpid" ]]
then
  sleep 11111111111111111  # placeholder for the real work
else
  echo -e "\nPID($runpid)($currpid) ::: At least one of \"$currsh\" is running !!!\n"
  exit 1
fi

Explanation

"lsof -t" lists the PIDs of all processes that currently have the script "$0" open.

Using "lsof" has two advantages:

It ignores the PIDs of editors such as vim, because vim works on its swap file, such as ".file.swp", rather than on the script itself.
It ignores processes forked by the running shell script, which most "grep"-based commands cannot do. Use the "pstree -pH pidnum" command to see details about the current process's forks.

I found this among the procmail package's dependencies:

apt install liblockfile-bin

To run: dotlockfile -l file.lock

file.lock will be created.

To unlock: dotlockfile -u file.lock

To list the files and commands in this package: dpkg-query -L liblockfile-bin

Ubuntu/Debian distros have the start-stop-daemon tool which is for the same purpose you describe. See also /etc/init.d/skeleton to see how it is used in writing start/stop scripts.

-- Noah

I'd also recommend looking at chpst (part of runit):

chpst -L /tmp/your-lockfile.loc ./script.name.sh

One line ultimate solution:

[ "$(pgrep -fn $0)" -ne "$(pgrep -fo $0)" ] && echo "At least 2 copies of $0 are running"

I had the same problem, and came up with a template that uses lockfile, a pid file that holds the process ID, and a kill -0 $(cat $pid_file) check to keep an aborted script from blocking the next run. This creates a foobar-$USERID folder in /tmp where the lockfile and pid file live.

You can still call the script and do other things, as long as you keep those actions in alertRunningPS.

#!/bin/bash

user_id_num=$(id -u)
pid_file="/tmp/foobar-$user_id_num/foobar-$user_id_num.pid"
lock_file="/tmp/foobar-$user_id_num/running.lock"
ps_id=$$

function alertRunningPS () {
    local PID=$(cat "$pid_file" 2> /dev/null)
    echo "Lockfile present. ps id file: $PID"
    echo "Checking if process is actually running or something left over from crash..."
    if kill -0 $PID 2> /dev/null; then
        echo "Already running, exiting"
        exit 1
    else
        echo "Not running, removing lock and continuing"
        rm -f "$lock_file"
        lockfile -r 0 "$lock_file"
    fi
}

echo "Hello, checking some stuff before locking stuff"

# Lock further operations to one process
mkdir -p /tmp/foobar-$user_id_num
lockfile -r 0 "$lock_file" || alertRunningPS

# Do stuff here
echo -n $ps_id > "$pid_file"
echo "Running stuff in ONE ps"

sleep 30s

rm -f "$lock_file"
rm -f "$pid_file"
exit 0

I found a pretty simple way to handle "one copy of script per system". It doesn't allow me to run multiple copies of the script from many accounts though (on standard Linux that is).

Solution:

At the beginning of the script, I put:

pidof -s -o '%PPID' -x $( basename $0 ) > /dev/null 2>&1 && exit

Apparently pidof works well here because:

it doesn't have a limit on the program name length like ps -C ...
it doesn't require me to do grep -v grep (or anything similar)

And it doesn't rely on lockfiles, which for me is a big win, because relying on them means you have to add handling of stale lockfiles. That is not really complicated, but if it can be avoided, why not?

As for checking "one copy of the script per running user", I wrote this, but I'm not overly happy with it:

(
    pidof -s -o '%PPID' -x $( basename $0 ) | tr ' ' '\n'
    ps xo pid= | tr -cd '[0-9\n]'
) | sort | uniq -d

I then check its output: if it is empty, there are no other copies of the script running as the same user.
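
On the stale-lockfile concern raised in this answer: locks taken with flock behave differently from plain lockfiles, because the kernel drops the lock when the holding process goes away, even if the file itself stays on disk. A small demonstration sketch, assuming flock from util-linux and a scratch file from mktemp:

```shell
#!/bin/bash

lock_file=$(mktemp)   # scratch lock file for the demonstration

# Simulate a script that takes the lock and is then killed outright,
# leaving it no chance to clean up.
sh -c 'exec 9>"$1"; flock -n 9 && kill -KILL $$' sh "$lock_file" \
    || echo "holder died with status $?"

# The file still exists, but the lock went away with the process.
if flock -n "$lock_file" true; then
    result="lock is free"
else
    result="lock is still held"
fi
echo "$result"
rm -f "$lock_file"
```

This is why the flock-based answers in this thread need no stale-lock handling, while the approaches built on lockfile, noclobber, or pid files do.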

From within your script:

ps -ef | grep $0 | grep $(whoami)

Here's our standard bit. It can recover from the script somehow dying without cleaning up its lockfile.

It writes the process ID to the lock file if it runs normally. If it finds a lock file when it starts running, it will read the process ID from the lock file and check if that process exists. If the process does not exist it will remove the stale lock file and carry on. And only if the lock file exists AND the process is still running will it exit. And it writes a message when it exits.

# lock to ensure we don't get two copies of the same job
script_name="myscript.sh"
lock="/var/run/${script_name}.pid"
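if [[ -e "${lock}" ]]; then
    pid=$(cat "${lock}")
    if [[ -e /proc/${pid} ]]; then
        echo "${script_name}: Process ${pid} is still running, exiting."
        exit 1
    else
        # Clean up previous lock file
        rm -f "${lock}"
    fi
fi
# Single quotes defer expansion until the trap fires; $? is saved first
# so that rm does not overwrite the real exit status
trap 'status=$?; rm -f "${lock}"; exit "$status"' INT TERM EXIT
# write $$ (PID) to the lock file
echo "$$" > "${lock}"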

Licensed under: CC-BY-SA with attribution
