Question

When I map a file into memory using mmap(), it is not loaded into memory (the page cache in RAM) all at once; only the parts that are needed are loaded, when a specific area of the mapped memory is accessed.

How does the system catch this access, made from C (*ptr = val;) or from asm (a MOV to/from the memory of a memory-mapped file), and trigger loading the data from the file or writing it back? What mechanism is used (an IRQ or something like pageable memory), and what is it called?

  • x86_64
  • OS: Linux, Windows

Solution

The short answer is paging.

The kernel keeps track of the different virtual memory areas (aka mappings) of each process. There are file-backed and anonymous (swap-backed) mappings. On Linux, you can look at them with cat /proc/<pid>/maps.
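As a rough illustration of how a file-backed mapping shows up in that list, here is a minimal Linux-only sketch. The helper names find_mapping and demo_file_mapping, and the temporary file under /tmp, are made up for this example; the code maps one page of a scratch file and then looks for the line in /proc/self/maps whose address range contains the mapping.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for mkstemp/ftruncate under strict -std modes */
#endif
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: copies into line_out the /proc/self/maps line whose
 * address range contains addr. Returns 1 if found, 0 if not, -1 on error. */
static int find_mapping(const void *addr, char *line_out, size_t cap)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return -1;

    char line[512];
    int found = 0;
    while (fgets(line, sizeof line, f)) {
        unsigned long start, end;
        /* Each line begins with "start-end" in hexadecimal. */
        if (sscanf(line, "%lx-%lx", &start, &end) == 2 &&
            (unsigned long)addr >= start && (unsigned long)addr < end) {
            snprintf(line_out, cap, "%s", line);
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}

/* Hypothetical demo: maps one page of a scratch file (assumes /tmp is
 * writable) and looks the mapping up in /proc/self/maps. */
static int demo_file_mapping(char *line_out, size_t cap)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    long page = sysconf(_SC_PAGESIZE);
    if (ftruncate(fd, page) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    void *p = mmap(NULL, (size_t)page, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping keeps the file referenced */
    if (p == MAP_FAILED) {
        unlink(path);
        return -1;
    }

    int rc = find_mapping(p, line_out, cap);

    munmap(p, (size_t)page);
    unlink(path);
    return rc;
}
```

Calling demo_file_mapping(buf, sizeof buf) and printing buf should show a single maps line that ends with the scratch file's path, which is what distinguishes a file-backed mapping from an anonymous one in that listing.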

A little overview of different stages in the lifetime of a demand-paged page:

  • The accessed memory page is present and has correct permissions → the access proceeds as normal.
  • The accessed memory page is not present or doesn't have the appropriate permissions → the CPU raises a page fault (a synchronous exception, #PF on x86, not an IRQ), and the kernel looks at the faulting address to decide what to do. Once the fault has been handled, the faulting instruction is retried:

    • It corresponds to a non-present file-backed mapping: load it from disk.
    • It corresponds to a non-present non-file-backed (anonymous) mapping: swap it in, or supply a zero-filled page if it has never been touched.
    • It's a write to a non-writable page, and corresponds to a writable private mapping: it's a COW (copy-on-write) fault, so duplicate (unshare) the page and mark the copy in the page table as writable.
  • The kernel decides to write the page back to disk: this may be triggered by memory pressure, an msync() call, or periodic writeback.
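The fault-and-retry cycle can be observed from user space. The following is a minimal Linux-only sketch, not a definitive measurement: the helper names fault_count and faults_while_touching and the scratch file under /tmp are assumptions made for this example. It maps a file, performs an ordinary store to one byte in each page, and uses getrusage() to count how many page faults the process took while doing so.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for mkstemp/ftruncate under strict -std modes */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: minor + major page faults taken so far by this
 * process, or -1 on error. */
static long fault_count(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt + ru.ru_majflt;
}

/* Hypothetical demo: maps npages of a scratch file (assumes /tmp is
 * writable), stores one byte into each page, and returns the number of
 * page faults taken during those stores, or -1 on error. */
static long faults_while_touching(size_t npages)
{
    char path[] = "/tmp/mmap_fault_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                   /* file lives on until fd and mapping go */

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * page;
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    /* No MAP_POPULATE: pages are left to be faulted in on first access. */
    volatile unsigned char *p =
        mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    long before = fault_count();
    for (size_t i = 0; i < npages; i++)
        p[i * page] = 1;            /* plain store; the CPU faults, the
                                       kernel maps the page, the store is
                                       retried transparently */
    long after = fault_count();

    munmap((void *)p, len);
    if (before < 0 || after < 0)
        return -1;
    return after - before;
}
```

On a typical Linux system, faults_while_touching(64) is expected to report roughly one fault per page, though the exact number depends on the kernel and filesystem. The program itself never sees the faults; it only executes stores, which is the point of the mechanism.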

Licensed under: CC-BY-SA with attribution