When you compute the time difference with (end.tv_nsec - begin.tv_nsec) / n, you are only taking the nanoseconds part of the elapsed time into account. You must also include the seconds, since the tv_nsec field only holds the fractional part of a second:
// Convert both timestamps to a single nanosecond count before subtracting:
int64_t end_ns = ((int64_t)end.tv_sec * 1000000000) + end.tv_nsec;
int64_t begin_ns = ((int64_t)begin.tv_sec * 1000000000) + begin.tv_nsec;
int64_t elapsed_ns = end_ns - begin_ns;
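If you need this in more than one place, you could factor the conversion into a small helper; here's a minimal sketch (to_ns is just a name I'm picking for illustration, and it assumes <cstdint> is included):

static inline int64_t to_ns(const struct timespec &ts) {
    // Widen tv_sec before multiplying so the product can't overflow a 32-bit type.
    return (int64_t)ts.tv_sec * 1000000000 + ts.tv_nsec;
}

int64_t elapsed_ns = to_ns(end) - to_ns(begin);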
In fact, with your current code you will sometimes get negative results: whenever the nanoseconds part of end has wrapped around and is less than begin's nanoseconds part.
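To see why: suppose begin is { tv_sec = 1, tv_nsec = 900000000 } and end is { tv_sec = 2, tv_nsec = 100000000 }. The real elapsed time is 200000000 ns (0.2 s), but end.tv_nsec - begin.tv_nsec alone yields -800000000.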
Fix that, and you'll be able to observe much more consistent results.
Edit: for the sake of completeness, here's the code I used for my tests, which gets me very consistent results (between 280 and 300 ns per call, regardless of how many iterations I use):
#include <cstdint>
#include <ctime>
#include <iostream>

int main() {
    const int loops = 100000000;

    struct timespec begin;
    struct timespec end;

    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &begin);
    for (int i = 0; i < loops; i++)
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end); // the call being measured
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end);     // final reading after the loop

    int64_t end_ns = ((int64_t)end.tv_sec * 1000000000) + end.tv_nsec;
    int64_t begin_ns = ((int64_t)begin.tv_sec * 1000000000) + begin.tv_nsec;
    int64_t elapsed_ns = end_ns - begin_ns;
    int64_t ns_per_call = elapsed_ns / loops;

    std::cout << ns_per_call << std::endl;
}
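For reference, I built it with something along these lines (the file name is arbitrary); note that on glibc older than 2.17, clock_gettime lives in librt, so you'd also need to link it:

g++ -O2 clock_bench.cpp -o clock_bench    # add -lrt for glibc < 2.17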