Question

I have a big array; iterating over it and doing my work takes about 50 ms. The app I am developing will run on a Tegra 3 or another fast CPU. I have split the work across four threads using pthreads: I divided the array by the total core count found in the system, so each thread iterates over one quarter of the array. Everything works, but it now takes 80 ms to do the work. Any idea why the multithreaded approach is slower than the single-threaded one? If I lower the CPU count to 1, everything is back to 50 ms.

for(int y = 0; y < height; y++)
{
    for(int x = 0; x < width; x++)
    {
        int index = (y*width)+x;
        int sourceIndex = source->getIndex(vertex_points[index].position[0]/ww, vertex_points[index].position[1]/hh);
        vertex_points[index].position[0] += source->x[sourceIndex]*ww;
        vertex_points[index].position[1] += source->y[sourceIndex]*hh;
    }
}

I am dividing the outer for loop of the code above into four parts based on the CPU count. vertex_points is a vector of positions.

So in each thread the loop looks like

for(int y = start; y < end; y++)

where start and end differ for each thread.

Solution

Thread startup time is typically on the order of milliseconds - that's what's eating your time.
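
The actual cost varies a lot between devices and OS versions, so it is worth measuring on the target. A minimal sketch, assuming C++11 `std::thread` is available (on Linux and Android it is normally built on top of pthreads); `averageSpawnJoinMicros` is a hypothetical helper name, not something from the question:

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: average cost, in microseconds, of creating and
// joining one thread that does no work at all.
double averageSpawnJoinMicros(int rounds) {
    using clock = std::chrono::steady_clock;
    const auto begin = clock::now();
    for (int i = 0; i < rounds; ++i) {
        std::thread t([] {});  // empty body: only the startup/teardown cost remains
        t.join();
    }
    const std::chrono::duration<double, std::micro> elapsed = clock::now() - begin;
    return elapsed.count() / rounds;
}
```

If the figure this reports, multiplied by the number of threads, is a noticeable fraction of the 50 ms, thread startup is a plausible culprit; if it is tiny, look elsewhere.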

With that in mind, 50 ms is not the kind of delay I'd worry about. If we were talking about 5 seconds, that would be a good candidate for parallelizing.

If the loop needs to be performed often, consider a solution with threads that are spun up early on and kept dormant, waiting for work to do. That'll run faster.
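
A rough sketch of that idea, assuming C++11 is available. `RowWorkers` and its members are hypothetical names, the row split mirrors the start/end scheme from the question, and it assumes `run` is only ever called from one thread at a time:

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: threads are created once and sleep between batches.
class RowWorkers {
public:
    explicit RowWorkers(int threadCount) : count_(threadCount) {
        assert(threadCount > 0);
        for (int id = 0; id < count_; ++id)
            workers_.emplace_back([this, id] { loop(id); });
    }

    ~RowWorkers() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& t : workers_) t.join();
    }

    // Calls job(start, end) once per worker, each with its own slice of
    // [0, height), and blocks until every slice is finished.
    void run(int height, const std::function<void(int, int)>& job) {
        std::unique_lock<std::mutex> lock(mutex_);
        job_ = &job;
        height_ = height;
        pending_ = count_;
        ++generation_;
        wake_.notify_all();
        done_.wait(lock, [this] { return pending_ == 0; });
    }

private:
    void loop(int id) {
        unsigned seen = 0;  // last batch number this worker has handled
        for (;;) {
            const std::function<void(int, int)>* job = nullptr;
            int height = 0;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [&] { return stopping_ || generation_ != seen; });
                if (stopping_) return;
                seen = generation_;
                job = job_;
                height = height_;
            }
            // Same split as in the question: rows [start, end) for this worker.
            const int start = static_cast<int>(static_cast<long long>(height) * id / count_);
            const int end = static_cast<int>(static_cast<long long>(height) * (id + 1) / count_);
            (*job)(start, end);
            {
                std::lock_guard<std::mutex> lock(mutex_);
                if (--pending_ == 0) done_.notify_one();
            }
        }
    }

    const int count_;
    std::mutex mutex_;
    std::condition_variable wake_;
    std::condition_variable done_;
    const std::function<void(int, int)>* job_ = nullptr;
    int height_ = 0;
    int pending_ = 0;
    unsigned generation_ = 0;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};
```

Create one `RowWorkers` at startup and call `run(height, ...)` with the body of the outer loop each time the array needs processing. Waking a sleeping thread still costs something, but the thread creation cost is paid only once.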

Also, is the CPU really 4-core? Honest cores or hyperthreading?
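
As a quick check of what the system reports, here is a small sketch using C++11's `std::thread::hardware_concurrency()`. Note that it returns the number of hardware threads, so it cannot tell physical cores from hyperthreads, and it may return 0 when the value is unknown. `workerCount` is a hypothetical helper name:

```cpp
#include <thread>

// Hypothetical helper: number of hardware threads reported by the runtime,
// falling back to 1 when the runtime cannot tell.
unsigned workerCount() {
    const unsigned reported = std::thread::hardware_concurrency();
    return reported == 0 ? 1u : reported;  // 0 means "not computable"
}
```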

Licensed under: CC-BY-SA with attribution