System call without context switching?

https://stackoverflow.com/questions/8878819

16-04-2021
|

Question

I was just reading up on how linux works in my OS-book when I came across this..

[...] the kernel is created as a single, monolitic binary. The main reason is to improve performance. Because all kernel code and data structures are kept in a single address space, no context switches are necessary when a process calls an operating-system function or when a hardware interrup is delivered.

That sounded quite amazing to me, surely it must store the process's context before running off into kernel mode to handle an interrupt.. But ok, I'll buy it for now. A few pages on, while describing a process's scheduling context, it said:

Both system calls and interrups that occur while the process is executing will use this stack.

"this stack" being the place where the kernel stores the process's registers and such.

Isn't this a direct contradiction to the first quote? Am I missinterpreting it somehow?

Solution

I think the first quote is referring to the differences between a monolithic kernel and a microkernel.

Linux being monolithic, all its kernel components (device drivers, scheduler, VM manager) run at ring 0. Therefore, no context switch is necessary when performing system calls and handling interrupts.

Contrast microkernels, where components like device drivers and IPC providers run in user space, outside of ring 0. Therefore, this architecture requires additional context switches when performing system calls (because the performing module might reside in user space) and handling interrupts (to relay the interrupts to the device drivers).

OTHER TIPS

"Context switch" could mean one of a couple of things, both relevant: (1) switching from user to kernel mode to process the system call, or an involuntary switch to kernel mode to process an interrupt against the interrupt stack, or (2) switching to run another user process in user space, with a jump to kernel space in between the two.

Any movement from user space to kernel space implies saving enough user-space to return to it reliably. If the kernel-space code decides that - while you're no longer running the user-code for that process - it's time to let another user-process run, it gets in.

So at the least, you're talking 2-3 stacks or places to store a "context": hardware-interrupts need a kernel-level stack to say what to return to; user method/subroutine calls use a standard stack for getting that done. Etc.

The original Unix kernels - and the model isn't that different now for this part - ran the system calls like a short-order cook processing breakfast orders: move this over on the stove to make room for the order of bacon that just arrived, start the bacon, go back to the first order. All in kernel switching context. Was not a huge monitoring application, which probably drove the IBM and DEC software folks mad.

When making a system call in Linux, a context switch is done from user-space to kernel space (ring3 to ring0). Each process has an associated kernel mode stack, that is used by the system call. Before the system call is executed, the CPU registers of the process are stored on its user-mode stack, this stack is different from the kernel mode stack, and is the one which the process uses for user-space executions.

When a process is in kernel mode (or user mode), calling functions of the same mode will not require a context switch. This is what is referred by the first quote.

The second quote refers to the kernel mode stack, and not the user-mode stack.

Having said this, I must mention Linux optimisations, where no transition is needed to the kernel space for executing a system call, i.e. all processing related to the system call is done in the user space itself (thus no context switch). vsyscall, and VDSO are such techniques. The idea behind them is quite simple. It is to send to the user space, the data that is required for execution of the corresponding system call. More info can be found in this LWN article.

In addition to this, there have been some research projects in which all the execution happens in the same ring. User space programs, and the OS code, both reside in the same ring. Idea is to get rid of the overhead of ring switches. Microsoft's [singularity][2] OS is one such project.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow