Question

When a modern OS loads a process, it pre-allocates a fixed amount of space for the stack. This means that the programmer has to be careful to avoid stack overflow by limiting call depth and/or by increasing the stack size at run time.

At first glance, this seems unnecessarily complicated. Why doesn't the OS implement the call stack as a dynamic array? When the initial (modest) size of the array is exceeded, the array could be doubled in size and reallocated to a new location on the heap, ensuring an amortized O(1) cost per push. That way, the programmer wouldn't have to worry about the stack size (well, no more than they need to worry about memory allocated on the heap).

In the argument above I assumed a real-mode OS; but in reality all modern OSes have virtual memory. This seems only to strengthen the case for a dynamically sized stack: there is essentially unlimited virtual address space available to a process, so why not reserve, say, a quarter of the address space as stack? As the process needs more memory, it will touch new pages in the virtual space, which will automatically be mapped to physical memory.


Solution

Default thread stack sizes on modern desktop OSes are usually in the megabyte range. They make use of virtual memory paging, as you describe, so physical memory is committed only as needed.

The main reason for not making the stack dynamically growable is that it's rarely needed. In practice, most cases where a thread blows the stack limit are some kind of runaway recursion. In that case you want the growth to stop at some point, so you need an upper limit anyway. You may feel the limit should be much larger, but that's not the view of most OS implementors.

Or to put it another way, implementors pick a size that's big enough in most cases but isn't so large that it causes too many issues when there's a problem.

BTW, a comment on another answer mentioned golang. It does implement something like what you describe for goroutines (very lightweight threads). The reason is, I think, instructive.

golang is designed to run with thousands of goroutines simultaneously. As such, each goroutine starts with a much smaller stack allocation than you'd see for an OS thread; I believe 4k is the default. That is very small, and you wouldn't want to enlarge every goroutine's stack just because a few need a larger one. Hence the dynamic growth.

Of course, this is very different from your typical OS thread model usage hence the different approach taken.

OTHER TIPS

Programs compiled to machine code with today's compilers assume the stack is contiguous (and does not move around during execution), so this would break backward compatibility with almost all existing software.

We are talking about the processor level, not the class-library level. It is like asking why the bricks in modern buildings do not have elevators.

The call stack is primitive: nothing more than bits in a processor register forming an address that points to a memory location. Arrays have no meaning here; it is not an object. Nor would it be a good idea to make it one, because the stack is used so frequently that even the slightest overhead (such as bounds checking) would hurt performance enormously.

The processor (together with the memory-management unit) monitors the bounds. If the stack pointer strays outside its permitted range, typically by touching an unmapped guard page, dedicated circuitry kicks in and redirects program execution to a trap handler. This is not your regular programming logic, though; you will not find any C# that checks whether current > Max.

Licensed under: CC-BY-SA with attribution