Question

I want to remove from list a every element that also appears in list b, and I was wondering how this could be achieved in the most efficient way.

Should I use

a.RemoveAll(x => b.AsParallel().Any(y => y == x));

or

a.AsParallel().Except(b.AsParallel());

or something else?

Can anyone explain what the underlying difference is? It seems to me, from measuring, that the second line is slower. What is the reason for this?


Solution

Using the second option, with two ParallelQuery<T> operations, will perform the entire operation in parallel:

var results = a.AsParallel().Except(b.AsParallel());

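The first option walks a sequentially and, for every element, must build and run a new ParallelQuery<T> over b. That repeats the query setup cost once per element of a, on top of a linear scan of b each time, so it will likely be far slower. Note also that the two options are not equivalent: RemoveAll modifies a in place, while Except leaves a untouched and returns a new sequence containing only distinct elements.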
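If the list really has to be modified in place, one sequential alternative to the first option is to hash b once and hand the lookup to RemoveAll. The following is only a minimal sketch: the element type int, the sample values, and the names RemovalSketch and RemoveInPlace are assumptions for illustration, not something from the question. It also shows how the result differs from Except when a contains duplicates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RemovalSketch
{
    // Hypothetical helper: hash b once, then each lookup is O(1) on average,
    // so the whole removal is roughly O(a.Count + b.Count).
    public static void RemoveInPlace<T>(List<T> a, IEnumerable<T> b)
    {
        var lookup = new HashSet<T>(b);
        a.RemoveAll(lookup.Contains);
    }

    static void Main()
    {
        // Sample data (assumed); note the duplicate 1 in a.
        var a = new List<int> { 1, 1, 2, 3, 4 };
        var b = new List<int> { 2, 4 };

        // Except does not modify a and yields each remaining value once.
        List<int> viaExcept = a.Except(b).ToList();

        // RemoveAll mutates a and keeps the remaining duplicates.
        RemoveInPlace(a, b);

        Console.WriteLine(string.Join(", ", viaExcept)); // 1, 3
        Console.WriteLine(string.Join(", ", a));         // 1, 1, 3
    }
}
```

Whether this beats either option from the question still depends on the data, so it is worth measuring alongside them.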

Depending on the number of elements, however, it may actually be faster to run this without AsParallel:

var results = a.Except(b);

In many cases, the overhead of parallelizing for smaller collections outweighs the gains. The only way to know, in this case, would be to profile and measure the options involved.
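A rough way to take such a measurement is sketched below with Stopwatch. The collection sizes, the element type, and the names ExceptTiming and TimeMs are assumptions for illustration. One detail matters: Except is deferred, so the query has to be enumerated (here via Count()) or the timing measures almost nothing.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

static class ExceptTiming
{
    // Hypothetical helper: runs the query to completion and returns elapsed milliseconds.
    public static long TimeMs<T>(Func<IEnumerable<T>> query, out int count)
    {
        var sw = Stopwatch.StartNew();
        count = query().Count();   // Count() forces the deferred query to execute
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        // Assumed sizes; substitute the real collections.
        List<int> a = Enumerable.Range(0, 1_000_000).ToList();
        List<int> b = Enumerable.Range(0, 1_000_000).Where(i => i % 2 == 0).ToList();

        // Warm-up runs so JIT compilation is not included in the timings.
        TimeMs<int>(() => a.Except(b), out _);
        TimeMs<int>(() => a.AsParallel().Except(b.AsParallel()), out _);

        long seq = TimeMs<int>(() => a.Except(b), out int seqCount);
        long par = TimeMs<int>(() => a.AsParallel().Except(b.AsParallel()), out int parCount);

        Console.WriteLine($"sequential: {seq} ms, {seqCount} items");
        Console.WriteLine($"parallel:   {par} ms, {parCount} items");
    }
}
```

A single run like this is noisy; repeating each measurement several times and comparing across realistic collection sizes gives a more trustworthy picture.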

Regarding "It seems to me, from measuring, that the second line is slower. What is the reason for this?":

This can have several causes. First, make sure you are measuring a release build that runs outside the Visual Studio host process (this is a common issue). If the result holds up under those conditions, it is probably due to the size of the collections and the data types involved.

Licensed under: CC-BY-SA with attribution