Embedded systems that do not use virtual memory typically either run as a single process or thread, or support a multi-threading rather than a multi-processing task model. That is to say, all threads/tasks share a common address space but have separate stacks (which also live in that single address space).
On a processor that has an MMU and supports virtual memory, this is done simply by not configuring the MMU, or at least by using a static MMU configuration with a one-to-one mapping, so that physical and virtual addresses are identical, or at least so that there is a single virtual address space.
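To make the one-to-one mapping concrete, here is a minimal sketch in C. It assumes a hypothetical single-level table of 1 MiB sections covering a 32-bit address space; `section_table`, `build_identity_map` and `translate` are made-up names for illustration, and the entries are plain base addresses rather than any real MMU's descriptor format.

```c
#include <stdint.h>

#define SECTION_SHIFT 20u                          /* 1 MiB sections (assumed) */
#define NUM_SECTIONS  4096u                        /* 4096 x 1 MiB = 4 GiB */
#define SECTION_MASK  ((1u << SECTION_SHIFT) - 1u) /* offset within a section */

/* Hypothetical table: entry i holds the physical base of virtual section i.
 * A real MMU descriptor would also carry access and cache attribute bits. */
static uint32_t section_table[NUM_SECTIONS];

/* Fill the table so that every virtual section maps to the physical
 * section at the same address (a static, one-to-one mapping). */
void build_identity_map(void)
{
    for (uint32_t i = 0; i < NUM_SECTIONS; i++) {
        section_table[i] = i << SECTION_SHIFT;
    }
}

/* Software model of the lookup the MMU hardware would perform. */
uint32_t translate(uint32_t vaddr)
{
    uint32_t base = section_table[vaddr >> SECTION_SHIFT];
    return base | (vaddr & SECTION_MASK);
}
```

With the table built this way, `translate(addr)` returns `addr` unchanged for every address, which is the sense in which the physical and virtual addresses are identical.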
Many low to mid-range architectures used in embedded systems, such as PIC, AVR, ARM7, ARM Cortex-M, Zilog Z8 etc., lack an MMU, and typically have much smaller memory resources than a typical ARM9/11/Cortex-A or x86-based system.
For multi-threading support in an MMU-less system you would typically use a real-time operating system (RTOS). Most RTOSes, with some notable exceptions, are simple task schedulers with IPC and synchronisation primitives, and do not use or support an MMU. High-end RTOSes such as QNX and VxWorks have MMU support, although in VxWorks it is optional.
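As a rough illustration of tasks sharing one address space, the sketch below shows a toy cooperative round-robin scheduler in C. It is a deliberate simplification and not how any particular RTOS is implemented: a real kernel would normally be preemptive, give each task its own stack and perform context switches, none of which is modelled here. All names (`add_task`, `run_scheduler`, `producer_task`, `consumer_task`) are hypothetical.

```c
#include <stddef.h>

#define MAX_TASKS 4

typedef void (*task_fn)(void);

static task_fn task_table[MAX_TASKS];
static size_t  task_count;

/* Shared state: every task sees the same globals because there is only
 * one address space. No locking is needed here only because the tasks
 * run to completion and are never preempted. */
static int shared_ticks;
static int log_sum;

/* Register a task; returns 0 on success, -1 if the table is full. */
int add_task(task_fn fn)
{
    if (task_count >= MAX_TASKS) {
        return -1;
    }
    task_table[task_count++] = fn;
    return 0;
}

/* Call each registered task in turn, for a fixed number of rounds. */
void run_scheduler(unsigned rounds)
{
    for (unsigned r = 0; r < rounds; r++) {
        for (size_t i = 0; i < task_count; i++) {
            task_table[i]();
        }
    }
}

/* Two example tasks that communicate through the shared globals. */
void producer_task(void) { shared_ticks++; }
void consumer_task(void) { log_sum += shared_ticks; }
```

Registering `producer_task` and `consumer_task` and calling `run_scheduler(3)` lets the consumer read values the producer wrote directly, with no copying between address spaces. In a preemptive RTOS the same sharing is what makes synchronisation primitives such as mutexes and semaphores necessary.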
uClinux is a general-purpose operating system (GPOS) targeted at processors that have sufficient memory resources to run Linux but which lack an MMU, such as ARM7 and Cortex-M. Arguably, though, Linux without an MMU misses one of the major advantages of using Linux, while still lacking hard real-time performance and requiring large memories; by comparison, a typical RTOS kernel requires (much) less than 10 kBytes of code.