Question

Imagine file-based caching of a process's result on a Linux machine.

  • We run the (resource-consuming) process only when the source data has changed.
  • On every query for the result, we check whether the source data has changed.
  • If the data has changed, we process it and save the result to the cache.
  • Changes and cache freshness (whether the cache was created after the last change) are checked by comparing file modification times (source data and cache file).
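The freshness check described in the last bullet could look roughly like the sketch below. The function name `cache_is_fresh` and both path parameters are hypothetical, and the sketch assumes a single source file; it treats equal timestamps as stale so that a tie leads to a rebuild rather than a stale result.

```python
import os


def cache_is_fresh(source_path, cache_path):
    """Return True only if the cache file exists and is newer than the source."""
    try:
        cache_mtime = os.stat(cache_path).st_mtime_ns
    except FileNotFoundError:
        # No cache file yet, so the result has to be computed.
        return False
    source_mtime = os.stat(source_path).st_mtime_ns
    # Strict comparison: equal timestamps count as stale (assumption of this sketch).
    return cache_mtime > source_mtime
```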

Tricky part: the process takes some time, and the data might change while we are processing it. Is it safe to set the cache file's modification time to the time of the query that triggered its creation?

It's something like:

  1. The source data changes at 20:00:01.
  2. A query arrives at 20:00:05, so we start recreating the cache.
  3. Processing finishes at 20:04:15.
  4. We save the cache.
  5. We set the cache file's modification time to 20:00:05 (to show that changes made after 20:00:05 are not included).
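As a minimal sketch of steps 2 to 5, assuming Python and a single source file: the name `rebuild_cache` and the `process` callback are hypothetical, and for brevity the cache file is written directly. The timestamp comes from the system clock, which may differ slightly from the clock the filesystem uses for modification times.

```python
import os
import time


def rebuild_cache(source_path, cache_path, process):
    """Recreate the cache and stamp it with the time the rebuild started."""
    # Step 2: remember when the query arrived, before reading any source data.
    query_time_ns = time.time_ns()

    # Step 3: the slow part; `process` is a hypothetical bytes -> bytes callback.
    with open(source_path, "rb") as src:
        result = process(src.read())

    # Step 4: save the cache.
    with open(cache_path, "wb") as out:
        out.write(result)

    # Step 5: set access and modification time back to the query time.
    os.utime(cache_path, ns=(query_time_ns, query_time_ns))
    return query_time_ns
```

With this stamp, a source change made at 20:02:00 would have a later modification time than the cache (20:00:05), so the next query would see the cache as stale.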

Is it safe with regard to backups, deployment, source control, and so on? What situations might cause problems with such a solution?


Solution

No. The safe solution is to write the file under a temporary name in the same directory and rename it to its final name only after all the data has been written.

  1. This way, you will never have an incomplete or truncated file.
  2. If an error occurs, the original data is still intact.
  3. For error handling, you only need to delete the temporary file.

It also solves the problems with backups and source control: you can configure them to ignore the temporary files.
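The temporary-file-and-rename approach could be sketched in Python as follows. The function name `write_cache_atomically` and the `.cache-` / `.tmp` naming are assumptions made for illustration; `os.replace` performs the rename and is atomic on POSIX systems when both names are on the same filesystem, which is why the temporary file is created in the cache file's own directory.

```python
import os
import tempfile


def write_cache_atomically(cache_path, data):
    """Write `data` to a temp file next to `cache_path`, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(cache_path))
    # Same directory as the target, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".cache-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        # Readers see either the old cache file or the complete new one.
        os.replace(tmp_path, cache_path)
    except BaseException:
        # Error handling only needs to delete the temporary file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Note that `tempfile.mkstemp` creates the file readable and writable only by its owner, so the permissions may need adjusting if other users read the cache.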

Licensed under: CC-BY-SA with attribution