Question

We have a shared folder which contains some files that need to be processed. We also have 3 UNIX servers, each running a shell script that takes and processes one file at a time. At the end of the script the file is moved away. The 3 UNIX servers don't communicate with each other and are not aware of each other.

In your opinion, what is the best way to guarantee that each file is processed exactly once, without raising concurrent-access issues/errors?


Solution

Either way, you need some type of file-locking mechanism. Some of the possibilities:

  • You can create a temporary lock file for every file being worked on. For example, for a file name.ext you create name.ext.lock just before you start processing it. If that file already exists (that is, if creating it fails with a "file exists" error), it means somebody is already working on that file, so you shouldn't touch it. (A minimal shell sketch of this follows the list.)
  • Second, you could use advisory locks. Advisory locking doesn't work on every type of network file sharing, and it is exposed mainly at the libc level, so it is awkward to use directly from shell scripting. I suggest digging into the manual of the flock libc API call. (See the second sketch after this list.)
  • Third, the hardest and most deeply Unix-specific option: mandatory locks. Mandatory locking means that the locks are effective even against processes that don't know anything about them. You can read more about them here: Mandatory file lock on linux. (See the third sketch after this list.)
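
A minimal shell sketch of the first option, assuming the shared folder is mounted at /shared and the work is done by a command called process_file (both placeholders, not names from the question). The lock is created atomically with the shell's noclobber option, so only one server wins the race:

    #!/bin/sh
    # Sketch of the lock-file approach; /shared, done/ and process_file
    # are placeholders.
    for f in /shared/*.ext; do
        [ -e "$f" ] || continue
        lock="$f.lock"
        # With noclobber, ">" fails if the lock file already exists, so the
        # redirection acts as an atomic "create the lock or give up" test.
        if ( set -o noclobber; echo "$$" > "$lock" ) 2>/dev/null; then
            # Re-check: another server may have processed and moved the file
            # between the glob and the moment we took the lock.
            if [ -e "$f" ]; then
                process_file "$f"      # the real work
                mv "$f" /shared/done/  # move the processed file away
            fi
            rm -f "$lock"              # release the lock
        fi
        # If the lock already existed, another server owns this file; skip it.
    done

One caveat: on older NFS versions the exclusive-create used here was not atomic; the usual workaround is to create a uniquely named temporary file and hard-link it to the lock name, treating a successful link as the lock.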
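
For the second option, the flock call itself lives in libc, but on Linux the util-linux flock(1) utility (not mentioned in the original answer) wraps it, so an advisory lock can also be taken from a script where that tool is installed. A sketch under that assumption, with the same placeholder names; as noted above, advisory locks are not honoured on every kind of network file system:

    #!/bin/sh
    for f in /shared/*.ext; do
        [ -e "$f" ] || continue
        # -n: non-blocking; if another server already holds the advisory lock
        # on this file, skip it instead of waiting. The existence re-check
        # inside guards against the file having been processed and moved by
        # another server before we obtained the lock.
        flock -n "$f" \
            -c "[ -e '$f' ] && process_file '$f' && mv '$f' /shared/done/" \
            || continue
    done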
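
For the third option the shared file system itself has to cooperate. A hedged sketch of the setup on Linux, again assuming /shared is the mount point; note that recent Linux kernels have dropped mandatory-locking support altogether, so this only applies to older systems:

    # Remount the shared file system with mandatory locking enabled.
    mount -o remount,mand /shared

    # Setting the setgid bit while clearing group execute marks the file:
    # from then on, fcntl() locks taken on it are enforced by the kernel
    # against every process, not only the cooperating ones.
    chmod g+s,g-x /shared/name.ext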

In your place I would go with the first, if I could modify how the processing works (for example, if I could hook it with a script, or if I were developing the processing script myself). If not, you probably need the third, although it doesn't always work.

Licensed under: CC-BY-SA with attribution