Question

I'm trying to debug a multi-architecture OSS program (the Das U-Boot boot loader under the sandbox configuration, which produces a standard Linux executable) using Eclipse CDT. I like how it presents things (wonderful GUI work!). For example, it shows something like this for a thread:

Thread [1] 9480 [core:2] (Suspend:Step)

As I ran the program, the "core number" would change (a value between 0 and 3 when running on an Intel i3). This initially led me to believe that the debugger was showing me different processing contexts of the application (i.e. I thought it was running on all 4 processor cores). I spent a LOT of time trying to install the "multi-core gdb debugger" and configure it, but had to admit failure in the end.

When I came back to the problem a weekend later, I noticed that while the "core number" would change, the thread ID did not. Additionally, I could not find where in the source code fork() (or a similar system call) was being made.

My current theory is that the program is indeed running as a single-threaded application but, for reasons I do not understand, enjoys jumping between the different processor cores of my system.

My questions are as follows:

  1. Is my current theory correct?
  2. If so, can I expect this kind of behaviour for __any__ single threaded application running outside the debug environment?
  3. From an optimization perspective, there would be some amount of context switching when moving between cores, even for a single threaded application. Is there any practical benefit to the jumping around?

Solution

  1. Yes.
  2. Yes.
  3. It depends.

What's happening is that the scheduler picks the best CPU (let's define it as any one of: physical CPU, core, hyperthread) for your process to run on, depending on a lot of variables. Generally a scheduler will attempt to keep a process on the same CPU to avoid expensive cache and TLB misses after a move, but it has to make a trade-off between the cost of moving a process to a CPU it didn't previously run on and the cost of waiting for the previous CPU to become available.
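One way to watch this happen outside the debugger is to have a single-threaded process sample which CPU it last ran on. The following is only a minimal sketch, assuming Linux with `/proc` mounted; it reads the "processor" field (field 39) of `/proc/self/stat`. The function names `parse_processor_field`, `current_cpu` and `sample_cpus` are made up for this illustration.

```python
import os
import time


def parse_processor_field(stat_line):
    """Extract field 39 ("processor") from the contents of /proc/<pid>/stat."""
    # Field 2 (the command name) may itself contain spaces or ')',
    # so split on the LAST ')' and count fields from there.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 39 is rest[36].
    return int(rest[36])


def current_cpu():
    """Return the CPU this process last ran on (Linux-only assumption)."""
    with open("/proc/self/stat") as f:
        return parse_processor_field(f.read())


def sample_cpus(samples=50, interval=0.01):
    """Sleep briefly between samples so the scheduler gets a chance to move us."""
    seen = []
    for _ in range(samples):
        seen.append(current_cpu())
        time.sleep(interval)
    return seen


if __name__ == "__main__" and os.path.exists("/proc/self/stat"):
    cpus = sample_cpus()
    print("PID", os.getpid(), "was seen on CPUs", sorted(set(cpus)))
```

On a busy multi-core machine the printed set will often contain more than one CPU number even though the PID never changes; on an idle machine it may well stay on a single CPU.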

Let's say that your process X was running on CPU 0. It became non-runnable for some reason (waiting for a lock or I/O, or preempted because it used too much CPU time and some other process needed to run). Another process Y starts running on CPU 0. Then your process X becomes runnable again while CPU 1 is idle. Now the scheduler can make four possible decisions:

  1. Wait for process Y to finish running, then run process X on CPU 0.
  2. Preempt process Y, run process X on CPU 0, move process Y to CPU 1.
  3. Preempt process Y, run process X on CPU 0 until it stops running, resume process Y on CPU 0.
  4. Run process X on CPU 1.

Different schedulers in different operating systems will make different decisions. Some prefer lower latency for all processes and disregard the cost of switching to a different CPU, so they'll always pick 4. Some prefer strong affinity, so they'll pick 1. In many cases the scheduler makes an educated guess about how much cache state process X has left on CPU 0 and decides that, since the process was suspended for some amount of time, it probably doesn't have much cache/TLB state left there, so moving it to a different CPU doesn't really cost that much. Many will take the memory bus layout into account when calculating the cost of moving the process, and in your case maybe the scheduler knows that the move is cheap. The scheduler might also make a best-effort guess about how process Y behaves and, if Y is likely to finish running soon, wait for it to finish. And so on.
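To make the trade-off concrete, here is a deliberately simplified toy model. It is not how any real scheduler works: the linear cache-decay assumption, the function name `choose_cpu`, and all parameters and numbers are invented purely for illustration. It only compares "wait for the previous CPU" (option 1) with "run on the idle CPU" (option 4).

```python
def choose_cpu(wait_for_prev_cpu_us, warm_cache_refill_us, suspended_us,
               cache_lifetime_us=1000.0):
    """Toy decision between staying on the previous CPU and migrating.

    All inputs are hypothetical estimates in microseconds.
    """
    # Assumed fraction of process X's cache/TLB state still present on the
    # previous CPU; it decays linearly to zero while X is suspended.
    warm_fraction = max(0.0, 1.0 - suspended_us / cache_lifetime_us)
    # Migrating throws away whatever warm state was left.
    migrate_cost = warm_fraction * warm_cache_refill_us
    # Staying means waiting until the previous CPU is free again.
    stay_cost = wait_for_prev_cpu_us
    return "migrate" if migrate_cost < stay_cost else "stay"


# Long wait for CPU 0, cache still warm: moving is cheaper than waiting.
print(choose_cpu(wait_for_prev_cpu_us=500, warm_cache_refill_us=200, suspended_us=0))
# Short wait, cache still warm: better to stay put.
print(choose_cpu(wait_for_prev_cpu_us=50, warm_cache_refill_us=200, suspended_us=0))
# Short wait, but X slept so long its cache is mostly gone: moving is cheap.
print(choose_cpu(wait_for_prev_cpu_us=50, warm_cache_refill_us=200, suspended_us=900))
```

A real scheduler weighs many more inputs (load on every CPU, cache and memory topology, priorities), but the shape of the decision is the same: estimate the cost of each option and take the cheaper one.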

Generally, unless you're doing something that really needs to squeeze the last nanosecond of performance out of your application, you don't need to worry about it. The scheduler will make a good enough decision, and if it doesn't, it will not matter that much for most applications anyway. As far as you need to know, in most cases your process is moved between CPUs with a frequency somewhere between "after every instruction" and "never".
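If you are in the rare case where it does matter, or you simply want the core number to stop changing while you debug, most systems let you restrict a process to specific CPUs. A minimal sketch, assuming Linux and Python's `os.sched_getaffinity` / `os.sched_setaffinity` (which are not available on every platform); the helper name `pin_to_one_cpu` is made up for this example:

```python
import os


def pin_to_one_cpu(pid=0):
    """Restrict a process (0 = the caller) to one CPU it is already allowed on.

    Returns (previous_allowed_set, chosen_cpu) so the caller can undo the pin.
    """
    allowed = os.sched_getaffinity(pid)
    chosen = min(allowed)  # arbitrary choice: the lowest-numbered allowed CPU
    os.sched_setaffinity(pid, {chosen})
    return allowed, chosen


if __name__ == "__main__" and hasattr(os, "sched_setaffinity"):
    previous, cpu = pin_to_one_cpu()
    print("pinned to CPU", cpu, "previously allowed:", sorted(previous))
    # Restore the original affinity mask when done.
    os.sched_setaffinity(0, previous)
```

Pinning removes the scheduler's freedom to pick option 4 above, so the process may end up waiting for its CPU more often; it is a trade-off rather than a free speed-up.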

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow