Question

I read a file into a vector like this:

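int readBytes(const string &filename, vector<uint32_t> &v)
{
    // fstat file, get filesize, etc.

    void *mapped = mmap(0, filesize, PROT_READ,
                        MAP_FILE|MAP_PRIVATE,
                        fhand, 0);
    if (mapped == MAP_FAILED)
        return -1;                            // mmap reports failure with MAP_FAILED, not NULL

    const uint32_t *filebuf = static_cast<const uint32_t*>(mapped);
    v.assign(filebuf, filebuf + numrecords);  // copy the records out of the mapping
    munmap(mapped, filesize);
    return 0;                                 // the function is declared int, so return a status
}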

In main() I have two successive calls (purely as a test):

vector<uint32_t> v(10000);    
readBytes(filename, v);
readBytes(filename, v);
// ...

The second call is almost always faster in wall-clock time:

Profile time [1st call]: 0.000214141 sec
Profile time [2nd call]: 0.000094109 sec

A look at the system calls shows that the two mappings land at different addresses:

mmap(NULL, 40000, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fe843ac8000
mmap(NULL, 40000, PROT_READ, MAP_PRIVATE, 4, 0) = 0x7fe843ac7000

Why is the second call faster? Coincidence? What, if anything, is cached?


Solution

Assuming you're on something *NIX-ish, the kernel keeps a page cache, whose job is precisely to cache file data like this and give you this speedup. Unless something else came along between the two calls and evicted those pages from the cache, they'll still be there for the second call.

So, the first call potentially has to:

  1. allocate pages
  2. map the pages into your process address space
  3. copy the data from those pages into your vector (possibly faulting the data from disk as it goes)

The second call probably finds the pages still in the cache, and only has to:

  1. map the pages into your process address space
  2. copy the data from those pages into your vector (they're pre-faulted this time, so it's a simple memory operation)

In fact, I've skipped a step: the open/fstat step in your comment is probably also accelerated, via the inode cache.
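One way to watch this effect is to time the whole open/fstat/mmap/copy/munmap sequence on successive calls. The following is a minimal sketch, not the asker's actual code: read_via_mmap and timed_read are hypothetical names, and the elided fstat step is filled in here with one plausible implementation. How big the gap between calls is depends on the machine and on what is already cached.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: map `path` read-only and copy its uint32_t records
// into `out`. Returns 0 on success, -1 on any failure.
int read_via_mmap(const std::string &path, std::vector<uint32_t> &out)
{
    int fd = open(path.c_str(), O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) { close(fd); return -1; }

    const size_t filesize = static_cast<size_t>(st.st_size);
    void *mapped = mmap(nullptr, filesize, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (mapped == MAP_FAILED) return -1;

    const uint32_t *records = static_cast<const uint32_t *>(mapped);
    // The copy is what touches the pages, so any page faults happen here.
    out.assign(records, records + filesize / sizeof(uint32_t));
    munmap(mapped, filesize);
    return 0;
}

// Hypothetical helper: wall-clock seconds taken by one read_via_mmap call,
// or -1.0 if the read failed.
double timed_read(const std::string &path, std::vector<uint32_t> &out)
{
    auto start = std::chrono::steady_clock::now();
    int rc = read_via_mmap(path, out);
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return rc == 0 ? elapsed.count() : -1.0;
}
```

Calling timed_read twice on the same file and printing both results should show the pattern from the question when the first call has to populate the caches. If the file was written or read recently, even the first call may already be served from the page cache.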

Other tips

Remember that your program sees virtual memory. A set of mapping tables ("page tables") translates the virtual addresses your program uses into real physical memory. The OS can back the two mmap() calls, which return two different virtual addresses, with the same physical memory: the copy of the file held in its disk cache. So the data only has to be loaded from disk once.
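The virtual-address side of this can be checked from user space. The sketch below uses a hypothetical helper, map_twice, that maps one file twice at the same time and compares the two views. Unlike the code in the question, both mappings are alive simultaneously, so their addresses necessarily differ. Whether the same physical pages sit behind both is up to the kernel and is not visible from this code.

```cpp
#include <cstddef>
#include <cstring>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Outcome of mapping one file twice at the same time.
struct DoubleMapResult {
    bool ok;                  // false if open, fstat, or mmap failed
    bool distinct_addresses;  // the mappings start at different virtual addresses
    bool same_contents;       // both mappings expose identical bytes
};

// Hypothetical helper: map `path` twice and compare the two views.
DoubleMapResult map_twice(const char *path)
{
    DoubleMapResult result = {false, false, false};

    int fd = open(path, O_RDONLY);
    if (fd < 0) return result;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) { close(fd); return result; }
    const size_t len = static_cast<size_t>(st.st_size);

    void *first = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    void *second = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);

    if (first != MAP_FAILED && second != MAP_FAILED) {
        result.ok = true;
        result.distinct_addresses = (first != second);
        result.same_contents = (std::memcmp(first, second, len) == 0);
    }
    if (first != MAP_FAILED) munmap(first, len);
    if (second != MAP_FAILED) munmap(second, len);
    return result;
}
```

On success, distinct_addresses and same_contents should both be true: two windows in the process address space showing the same file data.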

More detail:

  • First mmap(): OS just records the mapping
  • When you actually try to read the data: A "page fault" happens, since the data isn't in memory. The OS catches that, reads data from disk to its disk cache, and updates the page tables so that your program can read directly from that disk cache, then it resumes your program automatically.
  • First munmap(): OS disables the mapping, and updates your page tables so you can't read the file any more. Note that the file is still in the OS's disk cache.
  • Second mmap(): OS just records the mapping
  • When you actually try to read the data: A "page fault" happens, since the data isn't mapped. The OS catches that, notices that the data is already in its disk cache, and updates the page tables so that your program can read directly from that disk cache, then it resumes your program automatically.
  • Second munmap(): OS disables the mapping, and updates your page tables so you can't read the file any more. Note that the file is still in the OS's disk cache.
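The page faults in the steps above can be counted from inside the process. This is a rough sketch assuming a POSIX system where getrusage() fills in ru_minflt (faults served without I/O) and ru_majflt (faults that needed I/O). touch_mapped_file and FaultCount are hypothetical names. The counters are per-process, and the kernel may map several neighbouring pages on a single fault, so the numbers are indicative rather than exact.

```cpp
#include <cstddef>
#include <cstdint>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <unistd.h>

struct FaultCount {
    long minor;    // faults served without I/O (data already cached)
    long major;    // faults that required I/O
    uint64_t sum;  // sum of the bytes read, so the reads are really performed
    bool ok;       // false if open, fstat, or mmap failed
};

// Hypothetical helper: map `path`, read one byte from each page, and report
// how many page faults the process took while doing so.
FaultCount touch_mapped_file(const char *path)
{
    FaultCount fc = {0, 0, 0, false};

    int fd = open(path, O_RDONLY);
    if (fd < 0) return fc;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) { close(fd); return fc; }
    const size_t len = static_cast<size_t>(st.st_size);

    // mmap only records the mapping; no file data is touched yet.
    void *mapped = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (mapped == MAP_FAILED) return fc;

    struct rusage before, after;
    getrusage(RUSAGE_SELF, &before);

    // The first access to an unmapped page is what triggers a fault.
    const volatile unsigned char *bytes =
        static_cast<const volatile unsigned char *>(mapped);
    const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    for (size_t off = 0; off < len; off += page)
        fc.sum += bytes[off];

    getrusage(RUSAGE_SELF, &after);
    munmap(mapped, len);

    fc.minor = after.ru_minflt - before.ru_minflt;
    fc.major = after.ru_majflt - before.ru_majflt;
    fc.ok = true;
    return fc;
}
```

Run on a file that is already in the disk cache, the major count would be expected to stay at zero while the minor count is nonzero; on a cold file, major faults may appear on the first run and disappear on the second.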
Licensed under: CC-BY-SA with attribution