Question

I want to use cat filepath > /dev/null as a cheap memory caching mechanism. What I am wondering about is: if I call it a second time, and the file is already in the disk cache, is the OS smart enough to do nothing?

Update: I've tested this on a CIFS volume, using fadvise POSIX_FADV_WILLNEED to cache the file locally (using linux-ftools on the command line). It turns out that the volume needs to be mounted in read-write mode for this to work. In read-only mode, the fadvise seems to be ignored. This probably has something to do with the Samba oplock mechanism.


Solution

It is better to call posix_fadvise(..., POSIX_FADV_WILLNEED) than to cat the file to /dev/null: it requires less actual I/O and does not copy the file contents into userspace RAM, which would pollute the CPU caches.

Moreover, if the relevant part of the file is already in the cache, posix_fadvise will probably do a lot less work than cat file > /dev/null.
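As a rough illustration of the advice above, here is a minimal Python sketch that issues the same hint through os.posix_fadvise (available on Linux). The helper name prefetch is made up for this example; the call is only advisory, so the kernel may ignore it.

```python
import os


def prefetch(path):
    """Hint that `path` will be read soon; returns the file size in bytes."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        # Offset 0 with length 0 covers the whole file. The call is advisory,
        # so the kernel is free to ignore it.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
        return size
    finally:
        os.close(fd)
```

Unlike cat, nothing is copied into the calling process; the function returns as soon as the hint has been queued.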

If you feel that you really need the pages to be in core right now, then mmap the relevant section of the file and mlock it. Unlock it afterwards; once unlocked, the pages may be discarded immediately if memory pressure is high. On Linux, locking more memory than the RLIMIT_MEMLOCK limit allows requires root privileges (CAP_IPC_LOCK).
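The mmap plus mlock approach could look roughly like the sketch below. Python's mmap module does not expose mlock, so this goes through ctypes to the C library; it assumes 64-bit Linux, and the helper name read_locked is invented for the example. Because mlock can fail when the locked-memory limit is exceeded, the sketch reports whether the lock succeeded instead of raising.

```python
import ctypes
import mmap
import os

# Python's mmap module has no mlock, so call libc directly.
# Assumption: 64-bit Linux, where off_t is 64 bits wide.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.mmap.restype = ctypes.c_void_p
_libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                       ctypes.c_int, ctypes.c_int, ctypes.c_int64]
_libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
_libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
_libc.munlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
_MAP_FAILED = ctypes.c_void_p(-1).value


def read_locked(path, length):
    """Map the first `length` bytes of `path` read-only, try to mlock them,
    and return (data, locked). `length` must be > 0 and <= the file size."""
    fd = os.open(path, os.O_RDONLY)
    try:
        addr = _libc.mmap(None, length, mmap.PROT_READ, mmap.MAP_SHARED, fd, 0)
    finally:
        os.close(fd)  # the mapping stays valid after the descriptor is closed
    if addr == _MAP_FAILED:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    try:
        # mlock faults the pages in and pins them in RAM; it fails if the
        # process would exceed its locked-memory limit.
        locked = _libc.mlock(addr, length) == 0
        try:
            data = ctypes.string_at(addr, length)  # use the pages while pinned
        finally:
            if locked:
                _libc.munlock(addr, length)
    finally:
        _libc.munmap(addr, length)
    return data, locked
```

A shared read-only mapping is used deliberately: the locked pages are then the page-cache pages themselves, not private copies.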

In general, however, doing this kind of thing is a pessimisation and should be avoided. Forcing the kernel to behave how you want may reduce its ability to optimise for the actual workload.

OTHER TIPS

No, and it cannot.

Determining if a program will do nothing is usually more complex than just running it.

Why do you need to control the memory caching anyway? If absolutely necessary, consider using a tmpfs filesystem or compcache (a compressed RAM block device, since renamed zram).
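For the tmpfs suggestion, a minimal sketch is to copy the file into a tmpfs mount and read it from there. The helper name stage_in_ram is hypothetical, and /dev/shm is only assumed to be a tmpfs mount, as it commonly is on Linux; pass a different directory if your system differs.

```python
import os
import shutil


def stage_in_ram(path, tmpfs_dir="/dev/shm"):
    """Copy `path` into `tmpfs_dir` (assumed to be a tmpfs mount) and
    return the path of the copy."""
    dest = os.path.join(tmpfs_dir, os.path.basename(path))
    shutil.copyfile(path, dest)
    return dest
```

Note that this keeps a second copy of the data, and tmpfs pages can still be swapped out if swap is enabled.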

It won't do nothing, as the other answers have said. But if what you really meant was:

If I call it a second time, and the file is already in the disk cache, is the OS smart enough to not read it from disk a second time?

... then the answer is yes [1]. That's how the disk cache works, after all.


[1] As long as the filesystem in question uses the page cache, anyway.

It will be fast as hell, but it won't be a no-op (if it were, syscalls would be silently skipping the work they promise to do). However, depending on the filesystem driver used and the kernel options, you could be running close to memory bandwidth.
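One way to see this for yourself is to time two consecutive full reads of the same file. The sketch below mimics cat file > /dev/null by reading and discarding every byte; the helper name timed_read and the 1 MiB chunk size are arbitrary choices for illustration.

```python
import time


def timed_read(path, chunk_size=1 << 20):
    """Read `path` to the end, discarding the data, much like
    cat file > /dev/null. Returns (bytes_read, seconds)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 makes each read() map onto a read(2) system call.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start
```

Calling timed_read twice on a large file that was not cached beforehand should typically show the second call finishing much faster, yet still copying every byte; the actual numbers depend on the machine and filesystem.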

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow