concurrent_vector is not working inside parallel_for ( PPL )

https://stackoverflow.com/questions/15041027

11-03-2022

Question

Below is a working sample that uses parallel_for from the Parallel Patterns Library (PPL). The problem is that the values stored in sqr (a concurrent_vector) change on every run, but they should not.

I used concurrent_vector for random access. Why is it not working?

#include <iostream>
#include <ppl.h>
#include <concurrent_vector.h>

using namespace std;
using namespace concurrency;

const int a = 10, b = 30;

critical_section cs;

int main() {

    concurrent_vector< int > labels( a * b );

    concurrent_vector< int > sqr( 5 );

    // filling label vector
    for ( int y = 0; y < b; y++ ) {
        for ( int x = 0; x < a; x++ ) {

            if( x<2 && y>3 )
                labels[ a * y + x ] = 1;
            else if( x<30 && y<5 )
                labels[ a * y + x ] = 2;
            else if( x>5 && y>10 )
                labels[ a * y + x ] = 3;
            else if( x>2 && y>20 )
                labels[ a * y + x ] = 4;
        }
    }

    // printing
    for ( int y = 0; y < b; y++ ) {
        for ( int x = 0; x < a; x++ ) {

            cout << labels[ a * y + x ] << ", ";
        }
        cout << endl;
    }

    parallel_for ( 0, b, [ & ]( int y ) {
        for ( int x = 0; x < a; x++ ) {

            //cs.lock();  // it works when I lock here, but it is slow
            int i = labels[ a * y + x ];
            //cs.unlock();

            if ( i < 0 ) continue;

            sqr[ i ] ++;
        }
    } );

    for( int i=0; i<5; i++ )
        cout << sqr[i] << ", ";
    cout << "" << endl;

    system ("pause");

    return 0;
}

Solution 2

Using the task_group::wait method should be faster (you don't have to lock and unlock on every access), and it may work as you expect.

This method blocks the current task until the tasks of another task group have completed their work.

See MSDN: Parallel Tasks.

Update: I ran some timing tests, and it seems this is not a solution (besides, both versions fail on large inputs on my dual-core machine). This may be a bug "by design" in concurrent_vector, as in Intel's TBB; see "tbb::concurrent_vector returns wrong size".

Other tips

You aren't using any features of concurrent_vector that are relevant to concurrency. In fact, you could replace it with a standard vector with no difference. The values of i overlap across parallel iterations of the loop body, and there is absolutely no guarantee that concurrent writes to the same element of the vector are synchronized. The random results are simply the consequence of data races on non-atomic writes: sqr[ i ]++ is an unsynchronized read-modify-write, so increments from different threads can be lost.
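To make the data-race point concrete, here is a minimal sketch of two ways to count labels without losing increments. It is not the asker's code: it swaps plain std::thread in for parallel_for so that it builds outside MSVC, and the names for_each_row_chunk, count_labels_atomic and count_labels_local, as well as the default of 4 workers, are made up for illustration. The first variant makes each counter a std::atomic<int>; the second gives every worker a private histogram and merges it once at the end, which is the same idea PPL's combinable is meant for.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper standing in for parallel_for: splits rows [0, rows)
// into contiguous chunks and runs body(first, last) for each chunk on its
// own std::thread, then waits for all of them.
template <typename Body>
void for_each_row_chunk(int rows, unsigned workers, Body body) {
    workers = std::max(1u, workers);
    const int w = static_cast<int>(workers);
    const int chunk = (rows + w - 1) / w;
    std::vector<std::thread> pool;
    for (int first = 0; first < rows; first += chunk) {
        const int last = std::min(rows, first + chunk);
        pool.emplace_back(body, first, last);
    }
    for (auto& t : pool) t.join();
}

// Variant 1: every counter is atomic, so each increment is an indivisible
// read-modify-write and none can be lost.
std::vector<int> count_labels_atomic(const std::vector<int>& labels,
                                     int rows, int cols, int bins,
                                     unsigned workers = 4) {
    std::vector<std::atomic<int>> counts(bins);
    for (auto& c : counts) c.store(0);
    for_each_row_chunk(rows, workers, [&](int first, int last) {
        for (int y = first; y < last; ++y) {
            for (int x = 0; x < cols; ++x) {
                const int i = labels[static_cast<std::size_t>(cols) * y + x];
                if (i < 0 || i >= bins) continue;  // skip out-of-range labels
                counts[i].fetch_add(1, std::memory_order_relaxed);
            }
        }
    });
    std::vector<int> result;
    for (auto& c : counts) result.push_back(c.load());
    return result;
}

// Variant 2: each worker fills a private histogram and merges it into the
// shared one under a mutex exactly once, so the hot loop takes no lock.
std::vector<int> count_labels_local(const std::vector<int>& labels,
                                    int rows, int cols, int bins,
                                    unsigned workers = 4) {
    std::vector<int> total(bins, 0);
    std::mutex merge_lock;
    for_each_row_chunk(rows, workers, [&](int first, int last) {
        std::vector<int> local(bins, 0);
        for (int y = first; y < last; ++y) {
            for (int x = 0; x < cols; ++x) {
                const int i = labels[static_cast<std::size_t>(cols) * y + x];
                if (i < 0 || i >= bins) continue;  // skip out-of-range labels
                ++local[i];
            }
        }
        std::lock_guard<std::mutex> guard(merge_lock);
        for (int k = 0; k < bins; ++k) total[k] += local[k];
    });
    return total;
}
```

Called as count_labels_atomic(labels, b, a, 5) on a std::vector<int> filled the same way as labels in the question, both variants should give the same counts on every run: 97, 52, 48, 76, 27. The second variant usually scales better when many threads hit the same few counters, because the atomic version still makes them contend for the same memory locations.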

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow