Question

Im trying get the elapsed time of my program. Actually i thought I should use yclock() from time.h. But it stays zero in all phases of the program although I'm adding 10^5 numbers(there must be some CPU time consumed). I already searched this problem and it seems like, people running Linux are having this issue only. I'm running Ubuntu 12.04LTS.

I'm going to compare AVX and SSE instructions, so using time_t is not really an option. Any hints?

Here is the code:

 //Dimension of Arrays
unsigned int N = 100000;
//Fill two arrays with random numbers
unsigned  int a[N];
clock_t start_of_programm = clock();
for(int i=0;i<N;i++){
    a[i] = i;
}
clock_t after_init_of_a = clock();
unsigned  int b[N];
for(int i=0;i<N;i++){
    b[i] = i;
}
clock_t after_init_of_b = clock();

//Add the two arrays with Standard
unsigned int out[N];
for(int i = 0; i < N; ++i)
    out[i] = a[i] + b[i];
clock_t after_add = clock();

cout  << "start_of_programm " << start_of_programm  << endl; // prints
cout  << "after_init_of_a " << after_init_of_a  << endl; // prints
cout  << "after_init_of_b " << after_init_of_b  << endl; // prints
cout  << "after_add " << after_add  << endl; // prints
cout  << endl << "CLOCKS_PER_SEC " << CLOCKS_PER_SEC << endl;

And the output of the console. I also used printf() with %d, with no difference.

start_of_programm 0
after_init_of_a 0
after_init_of_b 0
after_add 0

CLOCKS_PER_SEC 1000000
Was it helpful?

Solution 2

The simplest way to get the time is to just use a stub function from OpenMP. This will work on MSVC, GCC, and ICC. With MSVC you don't even need to enable OpenMP. With ICC you can link just the stubs if you like -openmp-stubs. With GCC you have to use -fopenmp.

#include <omp.h>

double dtime;
dtime = omp_get_wtime();
foo();
dtime = omp_get_wtime() - dtime;
printf("time %f\n", dtime);

OTHER TIPS

clock does indeed return the CPU time used, but the granularity is in the order of 10Hz. So if your code doesn't take more than 100ms, you will get zero. And unless it's significantly longer than 100ms, you won't get a very accurate value, because it your error margin will be around 100ms.

So, increasing N or using a different method to measure time would be your choices. std::chrono will most likely produce a more accurate timing (but it will measure "wall-time", not CPU-time).

timespec t1, t2; 
clock_gettime(CLOCK_REALTIME, &t1); 
... do stuff ... 
clock_gettime(CLOCK_REALTIME, &t2); 
double t = timespec_diff(t2, t1);

double timespec_diff(timespec t2, timespec t1)
{
    double d1 = t1.tv_sec + t1.tv_nsec / 1000000000.0;
    double d2 = t2.tv_sec + t2.tv_nsec / 1000000000.0;

    return d2 - d1;
}

First, compiler is very likely to optimize your code. Check your compiler's optimization option.

Since array including out[], a[], b[] are not used by the successive code, and no value from out[], a[], b[] would be output, the compiler is to optimize code block as follows like never execute at all:

for(int i=0;i<=N;i++){
    a[i] = i;
}

for(int i=0;i<=N;i++){
    b[i] = i;
}

for(int i = 0; i < N; ++i)
    out[i] = a[i] + b[i];

Since clock() function returns CPU time, the above code consume almost no time after optimization.

And one more thing, set N a bigger value. 100000 is too small for a performance test, nowadays computer runs very fast with o(n) code at 100000 scale.

unsigned int N = 10000000;

Add this to the end of the code

int sum = 0;
for(int i = 0; i<N; i++)
    sum += out[i];
cout << sum;

Then you will see the times.

Since you dont use a[], b[], out[] it ignores corresponding for loops. This is because of optimization of the compiler.

Also, to see the exact time it takes use debug mode instead of release, then you will be able to see the time it takes.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top