Question

I'm trying to write some code to determine if clock_gettime used with CLOCK_MONOTONIC_RAW will give me results coming from the same hardware on different cores.

From what I understand it is possible for each core to produce independent results but not always. I was given the task of obtaining timings on all cores with a precision of 40 nanoseconds.

The reason I'm not using CLOCK_REALTIME is that my program absolutely must not be affected by NTP adjustments.

Edit:

I have found the unsynchronized_tsc function which tries to test whether the TSC is the same on all cores. I am now attempting to find if CLOCK_MONOTONIC_RAW is based on the TSC.

Final edit:

It turns out that CLOCK_MONOTONIC_RAW is always usable on multi-core systems and does not rely on the TSC even on Intel machines.

Was it helpful?

Solution

To do measurements this precisely; you'd need:

  • code that's executed on all CPUs, that reads the CPU's time stamp counter and stores it as soon as "an event" occurs
  • some way to create "an event" that is noticed at the same time by all CPUs
  • some way to prevent timing problems caused by IRQs, task switches, etc.

Various possibilities for the event include:

  • polling a memory location in a loop, where one CPU writes a new value and other CPUs stop polling when they see the new value
  • using the local APIC to broadcast an IPI (inter-processor interrupt) to all CPUs

For both of these methods there are delays between the CPUs (especially for larger NUMA systems) - a write to memory (cache) may be visible on the CPU that made the write immediately, and be visible by a CPU on a different physical chip (in a different NUMA domain) later. To avoid this you may need to find the average of initiating the event on all CPUs. E.g. (for 2 CPUs) one CPU initiates and both measure, then the other CPU initiates and both measure, then results are combined to cancel out any "event propagation latency".

To fix other timing problems (IRQs, task switches, etc) I'd want to be doing these tests during boot where nothing else can mess things up. Otherwise you either need to prevent the problems (ensure all CPUs are running at the same speed, disable IRQs, disable thread switches, stop any PCI device bus mastering, etc) or cope with problems (e.g. run the same test many times and see if you get similar results most of the time).

Also note that all of the above can only ensure that the time stamp counters were in sync at the time the test was done, and don't guarantee that they won't become out of sync after the test is done. To ensure the CPUs remain in sync you'd need to rely on the CPU's "monotonic clock" guarantees (but older CPUs don't make that guarantee).

Finally; if you're attempting to do this in user-space (and not in kernel code); then my advice is to design code in a way that isn't so fragile to begin with. Even if the TSCs on different CPUs are guaranteed to be perfectly in sync at all times, you can't prevent an IRQ from interrupting immediately before or immediately after reading the TSC (and there's no way to atomically do something and read TSC at the same time); and therefore if your code requires such precisely synchronised timing then your code's design is probably flawed.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top