Question

I've got a script that 1) runs often 2) is run by lots of different processes and 3) takes a long time.

update: The stuff that takes a long time is tests who's results will be the same for every process. Totally redundant.

I think it's time to do some caching, but I'm worried about the potential for races, conflicts, corruption, temporal-vortex-instability and chickens.

The complexity comes in because any of the processes could update the cache as well as read the cache, so I have to know how to handle all those combinations.

This smells to me like something that someone smarter and more educated than myself has already probably figured out.

Anyway, to make this question more concrete, here's what I've thought of so far. I'm using flock in my head, not sure if that's a good idea.

  • if the cache is fresh, read it and go away
  • if the cache is stale
    • try to get a write lock
      • if I get the lock, do the tests and update the cache
      • If I don't get the lock, does someone else have an write or a read lock?
      • If its shared, why are they reading a stale cache? Do I ignore them, do the tests and update the cache (or maybe this causes them to read a half-written cache... er...)
      • If it's exclusive, give them a short time to complete the tests and update the cache.

Hope that makes sense...

Was it helpful?

Solution

Here is a scheme which uses flock(2) for file locking in concurrent environments. It explains how "safe-cache" works.

Every cache file has two companion files (WLock and RLock). All flock requests are blocking except first one (NB WLock).

  • having WLock ensures opportunity for possible generation of fresh cache
  • having shared RLock ensures safe reading from cache file
  • and having exclusive RLock ensures safe writing to cache file

There are two companion files for only one reason, and that is when new cache is being generated, and old cache is not too old(cache time+N is not expired) clients can still use old cache instead of waiting for cache being generated.

Please comment on this scheme and make it simpler if possible.

cache locking diagram

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top