Question

I have implemented a file-locking mechanism along the lines suggested by the Linux man page for open(2), which states:

Portable programs that want to perform atomic file locking using a lockfile, and need to avoid reliance on NFS support for O_EXCL, can create a unique file on the same file system (e.g., incorporating hostname and PID), and use link(2) to make a link to the lockfile. If link(2) returns 0, the lock is successful. Otherwise, use stat(2) on the unique file to check if its link count has increased to 2, in which case the lock is also successful.

This seems to work perfectly. However, to get 100% code coverage in my tests, I need to cover the case where link(2) fails but the link count has increased to 2.

I've tried googling, but all I seem to be able to find is the same reference above regurgitated as "the way it's done".

Can anybody explain what set of circumstances would cause link(2) to fail (return -1) while the link count is nevertheless increased to 2?


Solution

The answer to your question is provided at the bottom of the link(2) page of the Linux Programmer's Manual:

   On NFS file systems, the return code may be wrong in case the NFS
   server performs the link creation and dies before it can say so. Use
   stat(2) to find out if the link got created.
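
Because the failure depends on an NFS server dying at exactly the wrong moment, it is hard to reproduce for real; one way to cover the branch in a test is to inject the fault. The sketch below is a shell rendition of the man page recipe, not the asker's code. The names try_link_lock, ln_lost_reply and the LINK_CMD hook are hypothetical, ln stands in for link(2), and the link count is read from the second field of ls -l. ln_lost_reply creates the link and then reports failure, which is what a lost NFS reply looks like to the caller.

```shell
#!/bin/sh
# LINK_CMD is a hypothetical hook so a test can swap in a faulty "ln".
LINK_CMD=${LINK_CMD:-ln}

# Simulates the NFS case: the link is created, but failure is reported.
ln_lost_reply() {
  ln "$@"
  return 1
}

# try_link_lock LOCKFILE: returns 0 if the lock was obtained.
try_link_lock() {
  lockfile=$1
  # Unique file on the same file system (host name and PID in the name)
  unique="$lockfile.`uname -n`.$$"
  : > "$unique" || return 1
  if $LINK_CMD "$unique" "$lockfile" 2>/dev/null; then
    rm -f "$unique"
    return 0
  fi
  # The link call reported failure: trust the link count instead.
  nlink=`ls -l "$unique" | awk '{print $2}'`
  rm -f "$unique"
  [ "$nlink" = 2 ]
}
```

Setting LINK_CMD=ln_lost_reply before calling try_link_lock on a free lock name drives the function through the link-count branch, while a lock that is genuinely held still yields a count of 1 and a failure. In C the same idea would apply by routing the link(2) call through a wrapper that a test can replace.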

OTHER TIPS

Creating a second file is more trouble than it is worth. Create a directory instead and check the result of the creation. mkdir(2) guarantees that only one process can succeed in creating a given directory; every other process gets a failure (EEXIST) because the directory already exists, even when two processes try at the same time. The OS handles the race, so you don't have to.
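
As a minimal sketch of that idea, ignoring stale locks for the moment: the function names and the default LOCK_DIR path are assumptions made for illustration, not part of the script further down.

```shell
#!/bin/sh
# Hypothetical lock location; override LOCK_DIR to suit your application.
LOCK_DIR=${LOCK_DIR:-${TMPDIR:-/tmp}/myAppName.lock}

# Returns 0 if we created the directory (we own the lock), non-zero otherwise.
acquire_lock() {
  mkdir "$LOCK_DIR" 2>/dev/null
}

# Releases the lock; only the process that acquired it should call this.
release_lock() {
  rmdir "$LOCK_DIR"
}
```

A caller would wrap its critical section as `if acquire_lock; then ...; release_lock; fi` and treat a non-zero return from acquire_lock as "someone else holds the lock".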

If it weren't for possible stale locks, that is all you would have to do. However, programs abort and do not always remove their lock, so the implementation has to be a little more elaborate.

In scripts I have often used the code below, which handles stale locks automatically. You can implement the same logic in C; see the man page:

man -s 2 mkdir

EXECUTION_CONTROL_FILE: a directory path plus a base name for the lock, something like /usr/tmp/myAppName (the functions append .lock and .lock2 to it)

second_of_now: returns the current time in seconds (included below)

LOCK_MAX_TIME: how long, in seconds, a lock can exist before it is considered stale

sleep 5: the code assumes that whoever holds the lock does something short and sweet. If not, your sleep cycle should probably be longer.
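
These settings are plain shell variables that must be set before the functions are called. A minimal setup might look like the following; the path is the example given above, and the 300-second limit is an assumed value, not one taken from the original script.

```shell
# Assumed example values; adjust both for your application.
EXECUTION_CONTROL_FILE=/usr/tmp/myAppName   # lock directories: myAppName.lock and myAppName.lock2
LOCK_MAX_TIME=300                           # hypothetical: locks older than 5 minutes count as stale
```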

LockFile() {
  L_DIR=${EXECUTION_CONTROL_FILE}.lock
  L_DIR2=${EXECUTION_CONTROL_FILE}.lock2
  (
  L_STATUS=1
  L_FILE_COUNT=2
  L_COUNT=10
  while [ $L_STATUS != 0 ]; do
    mkdir $L_DIR 2>/dev/null
    L_STATUS=$?
    if [ $L_STATUS = 0 ]; then
      # Create the timestamp file
      second_of_now >$L_DIR/timestamp
    else
      # The directory exists, check how long it has been there
      L_NOW=`second_of_now`
      L_THEN=`cat $L_DIR/timestamp 2>/dev/null`
      # The file does not exist, how many times did this happen?
      if [ "$L_THEN" = "" ]; then
        if [ $L_FILE_COUNT != 0 ]; then
          L_THEN=$L_NOW
          L_FILE_COUNT=`expr $L_FILE_COUNT - 1`
        else
          L_THEN=0
        fi
      fi
      if [ `expr $L_NOW - $L_THEN` -gt $LOCK_MAX_TIME ]; then
        # The lock looks stale. Ask UnlockFile to remove it; it backs off
        # while another process is already unlocking, and forces the
        # removal once L_COUNT has counted down to 0.
        UnlockFile $L_COUNT
        L_COUNT=`expr $L_COUNT - 1`
      else
        L_COUNT=10  # Reset this back in case it has gone down
        sleep 5
      fi
    fi
  done
  )
  L_STATUS=$?
  return $L_STATUS
}

####
#### Remove access lock
####
UnlockFile() {
  U_DIR=${EXECUTION_CONTROL_FILE}.lock
  U_DIR2=${EXECUTION_CONTROL_FILE}.lock2
  (
  # This 'cd' fixes an issue with UNIX which sometimes report this error:
  #    rm: cannot determine if this is an ancestor of the current working directory
  cd `dirname "${EXECUTION_CONTROL_FILE}"`

  mkdir $U_DIR2 2>/dev/null
  U_STATUS=$?
  if [ $U_STATUS != 0 ]; then
    if [ "$1" != "0" ]; then
      return
    fi
  fi

  trap "rm -rf $U_DIR2" 0

  # The directory exists, check how long it has been there
  # in case it has just been added again
  U_NOW=`second_of_now`
  U_THEN=`cat $U_DIR/timestamp 2>/dev/null`
  # The file does not exist then we assume it is obsolete
  if [ "$U_THEN" = "" ]; then
    U_THEN=0
  fi
  if [ `expr $U_NOW - $U_THEN` -gt $LOCK_MAX_TIME -o "$1" = "mine" ]; then
    # Remove the lock directory: it is stale, or it is our own lock
    rm -rf $U_DIR
  fi

  # Remove this short lock directory
  rm -rf $U_DIR2
  )
  U_STATUS=$?
  return $U_STATUS
}

####
second_of_now() {
  second_of_day `date "+%y%m%d%H%M%S"`
}

####
#### Return which second of the date/time this is. The parameters must
#### be in the form "yymmddHHMMSS", no centuries for the year and
#### years before 2000 are not supported.
second_of_day() {
  year=`printf "$1\n"|cut -c1-2`
  year=`expr $year + 0`
  month=`printf "$1\n"|cut -c3-4`
  day=`printf "$1\n"|cut -c5-6`
  day=`expr $day - 1`
  hour=`printf "$1\n"|cut -c7-8`
  min=`printf "$1\n"|cut -c9-10`
  sec=`printf "$1\n"|cut -c11-12`
  sec=`expr $min \* 60 + $sec`
  sec=`expr $hour \* 3600 + $sec`
  sec=`expr $day \* 86400 + $sec`
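  # Length of February; "bisex" is short for bissextile (leap). Every 4th
  # year is a leap year from 2000 to 2099, so the two-digit year suffices.
  if [ `expr $year % 4` = 0 ]; then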
    bisex=29
  else
    bisex=28
  fi
  mm=1
  while [ $mm -lt $month ]; do
    case $mm in
      4|6|9|11) days=30 ;;
      2) days=$bisex ;;
      *) days=31 ;;
    esac
    sec=`expr $days \* 86400 + $sec`
    mm=`expr $mm + 1`
  done
  year=`expr $year + 2000`
  while [ $year -gt 2000 ]; do
    year=`expr $year - 1`
    if [ `expr $year % 4` = 0 ]; then
      sec=`expr 31622400 + $sec`
    else
      sec=`expr 31536000 + $sec`
    fi
  done
  printf "$sec\n"
}
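
The hand-written date arithmetic is only needed where date cannot print an epoch timestamp. As an alternative sketch, assuming your date supports the %s format (GNU, BSD and BusyBox date do; older systems may not), second_of_now can be reduced to a single call. The result counts from 1970 in UTC rather than from 2000 in local time, which does not matter here because the lock code only compares the difference between two timestamps. It also sidesteps daylight-saving jumps, since the version above works from local time.

```shell
# Alternative second_of_now, assuming `date +%s` is available on this system.
second_of_now() {
  date +%s
}
```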

Use like this:

    # Make sure that 2 operations don't happen at the same time
    LockFile
    # Make sure we get rid of our lock if we exit unexpectedly
    trap "UnlockFile mine" 0
    #
    # ... do what you have to do ...
    #
    # We need to remove the lock
    UnlockFile mine
Licensed under: CC-BY-SA with attribution