Question

Say we call 10000 setTimeouts, each with a random time, each with a few nested timeouts too. What happens in terms of the 1 call stack, or are there multiple call stacks? How does that work?

So for example:

let i = 0
while (i++ < 10000) {
  process(`process ${i}`, randomBetween(5, 30))
}

function process(name, timeout) {
  console.log(`${name} called`)
  setTimeout(() => {
    console.log(`${name} timeout called`)
    setTimeout(() => {
      console.log(`${name} nested timeout 1 called`)
    }, 25)
    setTimeout(() => {
      console.log(`${name} nested timeout 2 called`)
    }, 30)
    setTimeout(() => {
      console.log(`${name} nested timeout 3 called`)
    }, 15)
    setTimeout(() => {
      console.log(`${name} nested timeout 4 called`)
    }, 5)
  }, timeout)
}

function randomBetween(min, max) {
  return Math.floor(Math.random() * (max - min + 1) + min)
}

I get how callbacks, event loops, and async work. But I don't understand the data structure used to implement the callstack(s) in this case. What would it look like, how would it work?

I am asking because I want to implement this feature in a custom programming language, and am having difficulty imagining what the structure is of the callstack(s) under the hood.

Solution

As Phillip's answer said, the answer largely comes down to how you design your execution model; there are various tradeoffs between different execution models. That said, if you're modeling your execution model on JavaScript's, there are a couple of things we do know about how JavaScript works.

JavaScript uses an execution model that boils down to single-threaded operation with an event queue. At the heart of this is the event queue, implemented as a priority queue with the scheduled time as the priority/sorting key, plus an infinite loop that pops items off that queue, called the event loop (sometimes also called a trampoline). A priority queue is just like a regular queue, except that instead of pushing items onto the back, you insert each item at the position determined by its sorting key. Efficient implementations of priority queues usually use a heap, a tree, or a skip list to maintain partial ordering without constantly re-sorting, since items get pushed and popped very frequently.
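As an illustration, a binary min-heap keyed on the scheduled time gives O(log n) push and pop. This is only a sketch; the class and method names here are invented, not taken from any real engine:

```javascript
// Minimal binary min-heap keyed on `runAt` (the scheduled time).
// The earliest-scheduled item is always at index 0.
class TimerQueue {
  constructor() { this.heap = []; }

  push(runAt, callback) {
    this.heap.push({ runAt, callback });
    // Sift the new item up until its parent is scheduled earlier or equal.
    let i = this.heap.length - 1;
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (this.heap[parent].runAt <= this.heap[i].runAt) break;
      [this.heap[parent], this.heap[i]] = [this.heap[i], this.heap[parent]];
      i = parent;
    }
  }

  peek() { return this.heap[0]; }

  pop() {
    const top = this.heap[0];
    const last = this.heap.pop();
    if (this.heap.length > 0) {
      this.heap[0] = last;
      // Sift the moved item down past any child scheduled earlier.
      let i = 0;
      for (;;) {
        const l = 2 * i + 1, r = 2 * i + 2;
        let min = i;
        if (l < this.heap.length && this.heap[l].runAt < this.heap[min].runAt) min = l;
        if (r < this.heap.length && this.heap[r].runAt < this.heap[min].runAt) min = r;
        if (min === i) break;
        [this.heap[min], this.heap[i]] = [this.heap[i], this.heap[min]];
        i = min;
      }
    }
    return top;
  }
}
```

However many timeouts are pushed in whatever order, `pop()` always hands back the one with the earliest `runAt`, which is exactly the property the event loop needs.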

The event loop is basically just an infinite loop that peeks at the scheduled execution time of the item at the front of the queue and optionally sleeps until that time. When it returns from sleeping, it pops the front of the queue and executes its callback, and this repeats until the loop is shut down. Any callback can push new items onto the queue, either to be executed "immediately" (i.e. pushed with a scheduled time that has already expired, so it runs after all the other events that have also expired) or at some future scheduled time, as when calling setTimeout()/setInterval() with a non-zero delay.
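A minimal sketch of that loop, assuming nothing beyond the description above: a virtual clock stands in for actual sleeping so the behavior is deterministic, and a sorted array stands in for a real priority queue (the names `schedule` and `runEventLoop` are made up for the example).

```javascript
const queue = [];
const log = [];

// Push an item with its absolute scheduled time.
function schedule(runAt, callback) {
  queue.push({ runAt, callback });
}

function runEventLoop() {
  let now = 0;
  while (queue.length > 0) {
    queue.sort((a, b) => a.runAt - b.runAt); // keep the earliest item in front
    const next = queue.shift();              // pop the front of the queue
    if (next.runAt > now) now = next.runAt;  // "sleep" until it is due
    next.callback(now);                      // run it; it may schedule more
  }
  return now;
}
```

A callback can schedule further work from inside the loop, which is exactly how nested setTimeout calls end up interleaved with everything else.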

Additionally, there are hardware events, like mouse clicks, keyboard input, network events, and filesystem events, that may also schedule items onto the queue asynchronously; in that case, the event loop may have to handle an interrupted sleep, respond to the hardware event, and then schedule another sleep. In some languages, the execution model also provides a mechanism to specify the priority of events, so that higher-priority tasks always take precedence over lower-priority tasks that expire at the same time.

The call stack is implemented, unsurprisingly, using a stack data structure containing local variable contexts and any other information needed to resume execution when returning from a function call. There's the simple stack data structure you might have been taught in CompSci classes, but there is also the hardware stack that allows your CPU to return from a function call opcode. Depending on the design of your language, you may want to use either a hardware or a software stack.

The basic principle behind hardware and software stacks is the same: you push and pop items at the top of the stack. Whenever you call a function, you push a new local variable context and the return address onto the stack; whenever you return from a function, you pop the top of the stack and jump to the return address. The main difference is that with a hardware stack you use low-level CPU instructions that manage the call stack implicitly and automatically.
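As a sketch of the software variant, a call stack can be as simple as an array of frames, each holding the callee's locals and a "return address" (here just the caller's name, since the example has no real instruction pointer; `enter`/`leave` are invented names):

```javascript
// A tiny software call stack: an array of frames.
const callStack = [];

function enter(name, locals) {
  const returnTo = callStack.length > 0
    ? callStack[callStack.length - 1].name // who to "jump" back to on return
    : null;
  callStack.push({ name, locals, returnTo });
}

function leave() {
  const frame = callStack.pop();           // discard the callee's locals...
  return frame.returnTo;                   // ...and "jump" to the caller
}
```

An interpreter for your language would call `enter` on every function call and `leave` on every return, consulting the top frame whenever it needs a local variable.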

JavaScript's execution model is single-threaded, which means that a single JavaScript execution context is allocated a single real OS thread. Recent versions of JavaScript also have async functions/coroutines, which mimic the execution of a thread, so the language can track multiple threads of execution without real OS threads and the overhead associated with real context switching. This is also called green threading.

At the most basic level, coroutines just mean that you switch the current call stack's base pointer without doing a full thread context switch (which involves saving the state of the CPU registers and setting up hardware timer interrupts to preempt the thread if it doesn't yield execution).

Without coroutines, a single OS thread usually has a single hardware stack, and languages that use a single-threaded execution model use the event loop to trampoline concurrent execution between multiple callbacks. With green threads/coroutines, that trampoline also switches the call stacks.
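One way to see that stack switch concretely is with JavaScript generators: each generator carries its own suspended frame, so a trampoline that resumes them in turn is effectively switching between call stacks. This is only a sketch of the idea; `trampoline` and `worker` are made-up names:

```javascript
// Round-robin over a set of coroutines until all have finished.
// Each `yield` hands control back to the trampoline.
function trampoline(coroutines) {
  const order = [];
  while (coroutines.length > 0) {
    const co = coroutines.shift();
    const { value, done } = co.next(); // resume until the next yield
    if (!done) {
      order.push(value);
      coroutines.push(co);             // still runnable: requeue it
    }
  }
  return order;
}

// A cooperative "green thread" that yields after each step.
function* worker(name, steps) {
  for (let i = 1; i <= steps; i++) yield `${name}:${i}`;
}
```

Each call to `co.next()` restores that generator's saved frame, runs it to its next `yield`, and suspends it again, which is the cooperative-scheduling half of what a real green-thread runtime does.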

OTHER TIPS

For your custom programming language, the answer is "whatever you want it to be". Once you've defined your execution model, that will tell you what your call stacks look like. In particular, you will need to decide whether your language has pre-emptive or cooperative concurrency. Until you've made those decisions, you're jumping the gun by asking about a JavaScript-like setTimeout function, as that's a concept which makes sense only in certain execution models.

That all said, if your execution model is something like JavaScript's "single threaded with event loop" model, a simple implementation would be something like:

  1. setTimeout is called 10k times. This will add 10k entries to a set¹ of pending callbacks somewhere in the language runtime, each of the form (runAt: <some time>, functionToRun: <someFunction>).
  2. Once the event loop is idle, the runtime will look at the set of pending callbacks created in step 1, pick one of them² for which currentTime >= runAt, and run it. This may add more entries to the set of pending callbacks.
  3. Once the chosen callback has run to completion, repeat step 2.
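The steps above can be sketched as a minimal runtime loop, using a plain array for the "set" of pending callbacks and a virtual clock in place of real time (`setTimeoutLike` and `drain` are invented names; `runAt` and `functionToRun` come from the steps):

```javascript
const pending = [];
let currentTime = 0;

// Step 1: each call records a pending callback with its due time.
function setTimeoutLike(functionToRun, delay) {
  pending.push({ runAt: currentTime + delay, functionToRun });
}

function drain() {
  while (pending.length > 0) {
    // Advance the clock to the earliest runAt if nothing is due yet.
    const earliest = Math.min(...pending.map(p => p.runAt));
    if (currentTime < earliest) currentTime = earliest;
    // Step 2: pick one eligible entry (currentTime >= runAt) and run it.
    const i = pending.findIndex(p => currentTime >= p.runAt);
    const [entry] = pending.splice(i, 1);
    entry.functionToRun();   // may add more entries to `pending`
  }                          // step 3: repeat until out of work
}
```

Because each callback runs to completion before the next is picked, there is never more than one callback on the stack at a time, no matter how many timeouts are pending.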

At some point, your program will run out of work to do and can then either terminate or wait for an external trigger which schedules more work. Again, a decision for your execution model.

So in a JavaScript-like execution model, there is only ever one call stack, with the rest of the state being stored in data structures. But again: you can do something completely different in your programming language if you want.

Notes:
1. A set in the mathematical sense of "a group of things" rather than the programming sense.
2. If multiple callbacks are eligible to be run, it's again up to your execution model which one is run.

Further continuing that thought ...

Timer queues might be built using some kind of tree data structure, with the key value being the time-of-day that the event is supposed to run. (Trees simplify the process of inserting new values at any position needed.) The timer-runner examines the lowest-valued node of the tree to determine if the event should occur. If so, the node is removed from the tree, the handler routine is called, and the node is discarded.

The timer-runner might be a thread, allowing it to sleep() until the next event is to occur. (This thread must also be awakened if anyone makes any modification to the tree.) Each time it wakes up, it looks again to find what is the next timeout to be run. Each active timeout is simply a node in the tree. A mutex is used to ensure that the tree contents are checked or modified by only one process/thread at a time.

Unsurprisingly, "timer runners" are by now a "stock software part," readily available off the shelf in places like GitHub.

Licensed under: CC-BY-SA with attribution