It is possible to run without virtual memory at all, just physical memory (in fact, most embedded systems run this way). How?

Question 1

Virtual memory is just a way of presenting physical memory so that each process has a separate memory space. This indirection is made possible by a special hardware unit called an MMU (Memory Management Unit).

Early computer systems just used the physical memory directly. This led to security issues where one user could access the process memory of all other users on the same system. Virtual memory addresses this problem by making the memory space of each process separate.

Question 2

Embedded systems that do not use virtual memory typically run as a single process or thread, or support a multi-threading rather than a multi-processing task model. That is to say, all threads/tasks share a common address space but have separate stacks (which also live in that single address space).

On a processor that has an MMU and supports virtual memory, this is done simply by not configuring the MMU, or at least by having a static MMU configuration with a one-to-one mapping so that physical and virtual addresses are identical, or at least so that there is a single virtual address space.
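As a rough illustration of the one-to-one mapping described above, here is a toy model in Python. It is only a sketch: the 4 KiB page size, the single-level table, and the names `build_identity_table` and `translate` are all assumptions made for the example, not how any particular MMU is programmed.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


def build_identity_table(num_pages):
    """Map every virtual page number to the same physical frame number."""
    return {vpn: vpn for vpn in range(num_pages)}


def translate(page_table, vaddr):
    """Translate an address the way a simple single-level MMU might."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        # Hypothetical stand-in for a hardware page fault.
        raise MemoryError(f"page fault at {vaddr:#x}")
    return page_table[vpn] * PAGE_SIZE + offset


# A static table covering 16 pages (64 KiB), set up once and never changed.
identity = build_identity_table(num_pages=16)
```

With this table, `translate(identity, addr)` returns `addr` unchanged for every mapped address, so software sees what is effectively plain physical memory even though the MMU is switched on.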

Many low- to mid-range architectures used in embedded systems, such as PIC, AVR, ARM7, ARM Cortex-M and Zilog Z8, lack an MMU and typically have much smaller memory resources than a typical ARM9/11/Cortex-A or x86-based system.

For multi-threading support in an MMU-less system you would typically use a real-time operating system (RTOS). Most RTOSes, with some notable exceptions, are simple task schedulers with IPC and synchronisation primitives and do not use or support an MMU. High-end RTOSes such as QNX and VxWorks have MMU support, although in VxWorks it is optional.

uClinux is a GPOS (general-purpose operating system) targeted at processors that have sufficient memory resources to run Linux but lack an MMU, such as ARM7 and Cortex-M. Arguably, though, Linux without an MMU misses one of the major advantages of using Linux, while still lacking hard real-time performance and requiring large memories; a typical RTOS kernel requires (much) less than 10 kBytes of code.

Question 3

A world where, "It is possible to run without virtual memory at all, just physical memory."

Hey, I grew up in that world!

The concept of virtual memory has been around since the 50s, but personal computers didn't support virtual memory until the early 90s. Back in the day, PCs were single process - when you were done with your word processing app, you would quit it and load your spreadsheet app.

A modern embedded system that, for example, controls your washing machine or the engine in your car is a single-process device; virtual memory and the MMU that provides it are a needless cost in terms of watts, silicon and development effort.

That said, it is perfectly possible to run multiple applications in a single address space. You can either make sure your compilers spit out relocatable (position-independent) code, i.e. all jumps to local functions are relative, as are references to global data, in which case each app can be loaded wherever the OS sees fit and will function normally (e.g. Linux shared objects), or you can represent applications in files in such a way that they can be relocated on load: when an application is loaded at an arbitrary base address, the OS corrects the address references during load (e.g. Windows DLLs and EXEs).
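To make the second option (fixing up addresses at load time) a bit more concrete, here is a small Python sketch. It is a simplified model, not the real DLL/EXE or ELF mechanism: it assumes the image was linked as if loaded at address 0, that addresses are 32-bit little-endian, and that the file carries a plain list of offsets to patch. The function name `load_with_relocations` and the image layout are made up for the example.

```python
import struct


def load_with_relocations(image, reloc_offsets, base):
    """Return a copy of `image` with each listed 32-bit slot rebased to `base`.

    Assumes the image was linked for base address 0, so the fix-up is
    simply "add the actual load address" to every recorded slot.
    """
    loaded = bytearray(image)
    for off in reloc_offsets:
        (addr,) = struct.unpack_from("<I", loaded, off)
        struct.pack_into("<I", loaded, off, (addr + base) & 0xFFFFFFFF)
    return bytes(loaded)


# Toy image: offset 0 holds a pointer to link-time address 0x10,
# offset 4 holds an ordinary constant that must NOT be touched.
image = struct.pack("<II", 0x10, 0xDEADBEEF)

# The hypothetical relocation table says only offset 0 contains an address.
loaded = load_with_relocations(image, reloc_offsets=[0], base=0x20000)
```

After loading at `0x20000`, the pointer slot holds `0x20010` and the constant is unchanged. The key design point is that the loader does not guess which words are addresses; the file format has to tell it.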

Question 4

Before virtual memory, if you were interested in having different applications run at the same addresses, then during a task switch you would simply copy one application's data out of that address range into storage elsewhere, copy the other task from its storage into that address range, and let it run for a while. Why you would want applications to run in the same place on a system without virtual memory is beyond understanding, but if you really felt you had to, you could. Operating systems clearly predate virtual memory systems, so it is a simple matter of studying those operating systems to understand how they work.
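The copy-out, copy-in task switch described above can be modelled in a few lines of Python. This is only a sketch under assumptions: the `Swapper` class, its method names, and the idea of a dictionary as the backing store are all invented for illustration, and a real system would also have to save and restore CPU registers.

```python
class Swapper:
    """Toy model: every task runs from the same fixed memory region."""

    def __init__(self, region_size):
        self.region = bytearray(region_size)  # the one place tasks execute from
        self.store = {}                       # task name -> saved memory image
        self.current = None                   # name of the task now in the region

    def add_task(self, name, image):
        if len(image) > len(self.region):
            raise ValueError("image too large for the run region")
        # Pad the saved image so it always fills the whole region.
        self.store[name] = bytes(image).ljust(len(self.region), b"\0")

    def switch_to(self, name):
        if self.current is not None:
            # Swap out: save whatever the running task left in the region.
            self.store[self.current] = bytes(self.region)
        # Swap in: copy the next task's saved image into the region.
        self.region[:] = self.store[name]
        self.current = name
```

For example, after `add_task("a", b"AAAA")` and `add_task("b", b"BBBB")`, switching to `a`, modifying the region, switching to `b` and then back to `a` restores `a`'s modified contents. The cost is obvious from the code: every switch copies the whole region twice.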

It reminds me of a question someone asked about how we dealt with the 64K boundary in the x86 world back in the day. The answer is that, in general, we didn't make programs that big or need data that big, so it wasn't really a problem. Sure, today there are some applications that deal with terabytes of data that can't (practically) be addressed in one space, and we deal with those, but most of us spend most of our time living within the memory spaces available. We did back then, and before that we didn't worry about virtual memory that didn't exist or the features that came with it.