Locking file in distributed system

Question

For your situation you should use share-mode locks. This is exactly what they were made for.

Oplocks won't to what you want - an oplock is not a lock, and doesn't prevent anyone doing anything. It's a notification mechanism to let the client machine know if anyone accesses the file. This is communicated to the machine by "breaking" your oplock, but this is not something that makes its way to the application layer (i.e. to your code) - it just generates a message to the client operating system to tell it to invalidate it's cached copy and fetch again from the server.

See MSDN here:

http://msdn.microsoft.com/en-us/library/windows/desktop/aa365433(v=vs.85).aspx

The explanation of what happens when another process opens a file on which you hold an oplock is here:

http://msdn.microsoft.com/en-us/library/windows/desktop/aa363786(v=vs.85).aspx

However the important point is that oplocks do not prevent other processes opening the file, they just allow coordination between the client computers. Therefore, oplocks do not lock the file at the application level - they are a feature of the network protocol used by the network file system stack to implement caching. They are not really for applications to use.

Since you are programming on windows the appropriate solution seems to be Share-mode locks, i.e. opening the file with SHARE_DENY_READ|SHARE_DENY_WRITE|SHARE_DENY_DELETE.

If share-mode locks are not supported on the CIFS server, you might consider flock() type locks. (Named after a traditional Unix technique).

If you are processing xyz.xml create a file called xyz.xml.lock (with the CREATE_NEW mode so you don't clobber an existing one). Once you are done, delete it. If you fail to create the file because it already exists, that means that another process is working on it. It might be useful to write information to the lockfile which is useful in debugging, such as the servername and PID. You will also have to have some way of cleaning up abandoned lock files since that won't occur automatically.

Database locks might be appropriate if the CIFS is for example a replicated system so that the flock() lock will not occur atomically across the system. Otherwise I would stick with the filesystem as then there is only one thing to go wrong.