Transferring 1-2 megabytes of data through regular files in Windows - is it slower than through RAM?

StackOverflow https://stackoverflow.com/questions/22099843

Question

I'm passing 1-2 MB of data from one process to another, using a plain old file. Is it significantly slower than going through RAM entirely?

Before answering yes, please keep in mind that in modern Linux at least, a file write actually goes to RAM first, and a daemon syncs the data to disk from time to time. So if process A writes 1-2 MB to a file and process B reads it within 1-2 seconds, process B simply reads the cached memory. It gets even better than that: in Linux there is a grace period of a few seconds before a new file is written to the hard disk, so if the file is deleted within that period, it is never written to the disk at all. This makes passing data through files as fast as passing it through RAM.

That is Linux; is it the same in Windows?

Edit: Just to lay out some assumptions:

  1. The OS is reasonably new - Windows XP or newer for desktops, Windows Server 2003 or newer for servers.
  2. The file is significantly smaller than available RAM - let's say less than 1% of available RAM.
  3. The file is read and deleted a few seconds after it has been written.

Solution

When you read or write a file, Windows will often keep some or all of it resident in memory (on the Standby List), so that if it is needed again, only a soft page fault is required to map it into the process's address space.

The algorithm that decides which pages of a file are kept around (and for how long) isn't publicly documented. So the short answer is that, if you are lucky, some or all of the file may still be in memory. During testing, you can use the Sysinternals tool VMMap to see how much of your file is still in memory.
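Since the caching behaviour isn't documented, measuring it is the practical way to find out. The following is a minimal sketch, not a rigorous benchmark: the function name and the 2 MB default are assumptions made for illustration, and both the "writer" and the "reader" run in the same process, so it only approximates the two-process case.

```python
import os
import tempfile
import time

def time_file_round_trip(size=2 * 1024 * 1024):
    """Write `size` bytes to a temp file, read them back.

    Returns (elapsed_seconds, data_intact).
    """
    payload = os.urandom(size)
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        start = time.perf_counter()
        with open(path, "wb") as f:   # what process A would do
            f.write(payload)
        with open(path, "rb") as f:   # what process B would do shortly after
            data = f.read()
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)               # delete the file right after reading
    return elapsed, data == payload

if __name__ == "__main__":
    elapsed, ok = time_file_round_trip()
    print(f"round trip took {elapsed * 1000:.2f} ms, data intact: {ok}")
```

Running it several times, and under memory pressure, gives a rough idea of how often the read is served from memory on a given machine.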

If you want to increase your chances of the data remaining resident, then you should use Memory Mapped Files to pass the data between the two processes.
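As a rough illustration of the memory-mapped approach, here is a sketch using Python's standard `mmap` module over a regular file. The helper names and the 2 MB size are assumptions chosen for the example. In a real setup the writer and the reader would be separate processes that agree on the path and the payload length.

```python
import mmap
import os
import tempfile

SIZE = 2 * 1024 * 1024  # the mapping size is fixed up front

def create_mapped_file(path, size=SIZE):
    """Give the file its final size; an empty file cannot be mapped."""
    with open(path, "wb") as f:
        f.truncate(size)

def write_via_mapping(path, payload):
    """Process A: map the whole file and copy the payload into it.

    The payload must not be larger than the file.
    """
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as m:
        m[:len(payload)] = payload

def read_via_mapping(path, length):
    """Process B: map the file read-only and copy `length` bytes out."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        return bytes(m[:length])

if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "transfer.bin")
    create_mapped_file(path)
    write_via_mapping(path, b"hello from process A")
    print(read_via_mapping(path, 20))
    os.remove(path)
```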

Good reading on Windows memory management: Mysteries of Windows Memory Management Revealed

OTHER TIPS

You can use FILE_ATTRIBUTE_TEMPORARY to hint that this data is never needed on disk:

A file that is being used for temporary storage. File systems avoid writing data back to mass storage if sufficient cache memory is available, because typically, an application deletes a temporary file after the handle is closed. In that scenario, the system can entirely avoid writing the data. Otherwise, the data is written after the handle is closed.

(I.e., you need to use that flag with CreateFile, and call DeleteFile immediately after closing that handle.)
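From Python, the closest equivalent I know of is the Windows-only `os.O_SHORT_LIVED` flag, which the Microsoft C runtime documents as "create a file as temporary and if possible do not flush to disk". That it ends up as FILE_ATTRIBUTE_TEMPORARY on the CreateFile call is my understanding rather than something stated in the quote above. The sketch below uses hypothetical helper names and falls back to ordinary flags on other systems.

```python
import os
import tempfile

# Windows-only flags; they fall back to 0 elsewhere so the sketch still runs.
O_SHORT_LIVED = getattr(os, "O_SHORT_LIVED", 0)
O_BINARY = getattr(os, "O_BINARY", 0)

def write_short_lived(path, payload):
    """Process A: create the file with the 'temporary' hint and write to it."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | O_BINARY | O_SHORT_LIVED
    fd = os.open(path, flags)
    with os.fdopen(fd, "wb") as f:  # closes the descriptor on exit
        f.write(payload)

def read_and_delete(path):
    """Process B: read the data, then delete the file straight away."""
    with open(path, "rb") as f:
        data = f.read()
    os.remove(path)
    return data

if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "handoff.tmp")
    write_short_lived(path, b"x" * (1024 * 1024))
    print(len(read_and_delete(path)))
```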


But even if the file remains cached, you still have to copy the data twice: from process A to the cache (the WriteFile call), and from the cache to process B (the ReadFile call).

Using memory mapped files (MMF, as josh poley already suggested) has the primary advantage of avoiding one copy: the same physical memory pages are mapped into both processes.

An MMF can be backed by the system paging file instead of a regular file (pass INVALID_HANDLE_VALUE to CreateFileMapping), which basically means that it stays in memory unless swapping becomes necessary.
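A mapping with no regular file behind it can be sketched with Python's `multiprocessing.shared_memory` module. To my understanding, CPython implements this on Windows as a named file mapping backed by the paging file. The function names `publish` and `consume` are made up for this example, and how process B learns the block name and payload length (command line, pipe, and so on) is left open.

```python
from multiprocessing import shared_memory

def publish(payload):
    """Process A: create a named shared block and copy the payload into it."""
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    shm.buf[:len(payload)] = payload
    # Keep the block open until the consumer has attached; on Windows it
    # disappears once the last handle to it is closed.
    return shm

def consume(name, length):
    """Process B: attach by name and copy `length` bytes out."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        # The block may be rounded up to a page boundary, so slice explicitly.
        return bytes(shm.buf[:length])
    finally:
        shm.close()

if __name__ == "__main__":
    payload = b"y" * (2 * 1024 * 1024)
    block = publish(payload)
    try:
        print(consume(block.name, len(payload)) == payload)
    finally:
        block.close()
        block.unlink()  # releases the block on POSIX systems
```

The size is chosen when the block is created, so the sender has to know the payload length in advance.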

The major downside is that you can't easily grow the memory mapping to meet changing demands; you are stuck with the initial size.


Whether that matters for a 1-2 MB data transfer depends mostly on how you acquire the data and what you do with it. In many scenarios the additional copy doesn't really matter.

Licensed under: CC-BY-SA with attribution