Remove Interlocked.Add in Parallel.For?
-
26-10-2019 - |
Question
I have some code to do some lookups and count the occurance using parallel.for:
//...initialize _table with int values...
int elements=60;
int[] outerCounter=new int[elements];
Parallel.For(1, 2000, i0=>
{
int[] counter=new int[elements];
int nextPos0=_table[10+i0];
for(i1=i0+1; i1<1990; i1++){
//...here are also some additionale calculations done...
int nextPos1=_table[nextPos0+i1];
counter[nextPos1]++;
}
//synchronize
for(int i=0; i<elements;i++){
Interlocked.Add(ref outerCounter[i], counter[i]);
}
}
This version is way faster then a sequential calculation. But i would like to find a different solution to count the occurance as the Interocked.Add is a bottleneck. I was investigating if Plinq would be an option but wasn't so far able to find a way to count the occurance of the nextPos1 elements in an array.
Solution
I'd basically suggest the same thing as Hans, but I thought it would be useful to provide some code. Here's how I'd probably tackle the problem:
//...initialize _table with int values...
int elements=60;
List<int[]> outerCounter=new List<int[]>();
Parallel.For(1, 2000, i0=>
{
int[] counter;
lock(outerCounter)
{
if (outerCounter.Count == 0)
counter = new int[elements];
else
{
counter = outerCounter[outerCounter.Count - 1];
outerCounter.RemoveAt(outerCounter.Count - 1);
}
}
int nextPos0=_table[10+i0];
for(i1=i0+1; i1<1990; i1++){
//...here are also some additionale calculations done...
int nextPos1=_table[nextPos0+i1];
counter[nextPos1]++;
}
lock (outerCounter)
outerCounter.Add(counter);
});
int totalCounter = new int[elements];
Parallel.For(0, elements - 1, i =>
{
foreach (int[] counter in outerCounter)
totalCounter[i] += counter[i];
});
OTHER TIPS
From what I get from the code you aren't going to be able to do this correctly without locking outcounter[i] since all threads are going to write to all the values in outcounter.
Bit late to the party here, but if you're only incrementing the values in counter[] and outerCounter[], you can use an overloaded version of Parallel.For()
Instead of creating a local array of elements each loop, you can create one local to the execution (and will be only operated on by one thread at a time)
For example:
int elements=60;
int[] outerCounter=new int[elements];
Parallel.For (1, 2000,
() => new int[elements], // Initialize the local value.
(i0, state, counter) =>
{
int nextPos0=_table[10+i0];
for(i1=i0+1; i1<1990; i1++)
{
//...here are also some additionale calculations done...
int nextPos1=_table[nextPos0+i1];
counter[nextPos1]++;
}
}
counter => // Add the local value
{
for(int i=0; i<elements;i++)
{
Interlocked.Add(ref outerCounter[i], counter[i]);
}
}
);
I haven't tested the above code, but that's the jist of it. It will drastically reduce the amount of times you're calling Interlocked.Add()
For more information, this site is very good: http://www.albahari.com/threading/part5.aspx