Question

So when you open a PE file (.exe) or call CreateProcess (from the Win32 API), the following procedure is followed:

  1. The file header, the image sections, and the DLLs that the exe links against are mapped into the process's own virtual memory.

  2. The CPU begins execution at the program's start address.

So here comes my question: all the instructions in the PE image use addresses relative to the process's own private address space (virtual memory), which begins at 0. Also, sometimes this memory is paged out by Windows to secondary storage (HDD). How does the CPU find out the real physical address in RAM? Also, how does Windows switch from one thread to another according to priority in order to support multi-threading, and how does it send idle instructions when the CPU is not fully used? After all these discoveries I'm starting to think that the machine code stored in PE files isn't really executed directly by the CPU, but instead in some Windows-managed environment. Can this be true, and if so, doesn't this slow down execution?

EDIT: OK, so the question should be rewritten as follows: "Are Windows processes executed in a core layout program or directly on the CPU?" I got the answer I wanted, so the question is solved.


Solution

A complete answer would fill an entire book, but in short:

  1. From a high-level view, finding the physical address is done by dividing the address by some constant (typically 4096), converting the address to its corresponding "page", and looking up that page in a table, which points to the index of the real, physical memory page, if one exists. Some or all of that may be done automatically by the CPU without anyone noticing, depending on the situation.
    If the page is not currently in physical memory, the OS has to read it from disk before the code that tried to access it can continue, and the page does not necessarily end up in the same physical page as before.
    In reality it's much more complex: the table is really an entire hierarchy of tables, and in addition there is a small cache inside the CPU (the translation lookaside buffer, typically around 50 entries) that does this task automatically for recently accessed pages, without firing an interrupt and running special kernel code.
    So, depending on the situation, things might happen fully automatically and invisibly, or the OS kernel may be called, traversing an entire hierarchy of tables and finally resorting to loading data from disk (and I haven't even considered that pages may have protections that prevent them from being accessed, or protections that cause them to be copied when written to, etc.).
  2. Multi-threading is "relatively simple" in comparison. It's done by having a timer fire an interrupt at a fixed interval (under Windows typically around 16 milliseconds, but this can be adjusted), and running some code (the "scheduler") inside the interrupt handler, which decides whether to return to the current thread or to switch to another thread's context and run that one instead.
    In the particular case of Windows, the scheduler will always satisfy highest priority tasks first, and only consider lower priority tasks when no non-blocked higher priority tasks are left.
    If no other tasks are running, the idle task (which has the lowest priority) runs. The idle task may perform tasks such as zeroing reclaimed memory pages "for free", or it may throttle down the CPU (or both).
    Further, when a thread blocks (e.g. when reading a file or a socket), the scheduler runs even without a timer interrupt. This ensures that the CPU can be used for something useful during the time the blocked thread can't do anything.
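
To make point 1 a bit more concrete, here is a minimal Python sketch of the "divide by the page size, then look the page up in a table" step. It assumes a flat, single-level table held in a dict rather than the real hierarchy of tables, and it fakes "reading from disk" by simply handing out a free frame. The names `split_address`, `translate`, `access`, and `PageFault` are made up for illustration and are not part of any Windows or CPU interface.

```python
PAGE_SIZE = 4096  # bytes per page, the typical value mentioned above


class PageFault(Exception):
    """Raised when a virtual page has no physical frame assigned."""


def split_address(virtual_address):
    """Split a virtual address into (page number, offset inside the page)."""
    return divmod(virtual_address, PAGE_SIZE)


def translate(page_table, virtual_address):
    """Map a virtual address to a physical one using a flat page table.

    page_table is a dict {virtual page number: physical frame number};
    this single-level layout is a simplifying assumption of the sketch.
    """
    page, offset = split_address(virtual_address)
    frame = page_table.get(page)
    if frame is None:
        # On real hardware the CPU raises a fault and the kernel takes over.
        raise PageFault(page)
    return frame * PAGE_SIZE + offset


def access(page_table, free_frames, virtual_address):
    """Translate an address, "paging in" the page on a fault, then retry."""
    try:
        return translate(page_table, virtual_address)
    except PageFault as fault:
        # Stand-in for reading the page from disk into some free frame;
        # the frame chosen need not be the one the page occupied before.
        page_table[fault.args[0]] = free_frames.pop()
        return translate(page_table, virtual_address)
```

For example, with `page_table = {0: 7, 2: 3}`, the address `0x2010` lies in page 2 at offset `0x10`, so `translate` returns `3 * 4096 + 16`. Calling `access` on an address in page 1 first assigns a free frame and then retries the same translation, which mirrors how the faulting code is resumed only after the page is present.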
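
The scheduling rule from point 2 can be sketched in a similar spirit. This is only a toy model of the decision made when the scheduler runs, assuming that a larger number means a higher priority and ignoring everything else a real scheduler does (time slices, priority boosts, multiple cores). `Thread`, `IDLE`, and `pick_next` are hypothetical names, not Windows APIs.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    priority: int          # assumption: larger number = higher priority
    blocked: bool = False  # True while waiting on a file, socket, etc.


# The idle task has the lowest priority, so it only runs when nothing else can.
IDLE = Thread("idle", priority=-1)


def pick_next(threads):
    """Return the highest-priority thread that is not blocked.

    Called on every timer interrupt and whenever a thread blocks.
    Falls back to the idle task when no thread is ready to run.
    """
    ready = [t for t in threads if not t.blocked]
    return max(ready, key=lambda t: t.priority, default=IDLE)
```

With a ready thread at priority 10 and another at priority 5, `pick_next` keeps returning the first one until it blocks; only then does the lower-priority thread get the CPU, and once both are blocked the idle task is chosen.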
Licensed under: CC-BY-SA with attribution