Question

I got a large list to loop through (1.500.000 items), with each item I have to do a very small check. Totally during 30 seconds.

The CPU utilization when using Sequential is around 10%, so there's a lot of resources not used.

The first thought was to use Parallel, but due to the limited time duration for each item, Parallel lasts longer than a sequential Foreach, this is due to "Why was the parallel version slower than the sequential version in this example?", which explains that the creation of each task will cost time.

So I had another thought and that is to divide the list in 4 (or more) equal peaces and create a thread to loop through the items to get it faster.

Before creating my own class, is this a good approach? Or any other thoughts of how to speed things up? Or do you know a better way of handling this.

Code

The code I created for another parallel approach: (used in my own static class)

public static void ForEach<T>(IEnumerable<T> list, Action<T> body, int listDevide)
{
    // Number of items
    int items = list.Count();
    // Divided (in int, so floored)
    int listPart = items / listDevide;
    // Get numbers extra for last run
    int rest = items % listDevide;

    // List to save the actions
    var actions = new List<Action>();
    for(var x = 0; x < listDevide; x++)
    {
        // Create the actions
        actions.Add(delegate {
            foreach(var item in list.Skip(x * listPart).Take(listPart))
            {
                body.Invoke(item);
            }
        });
    }

    // Run the actions parallel
    Parallel.Invoke(actions.ToArray());
}

Remark: "rest" variable to do the last items is currently not used in this example.

Solution below, more info: http://msdn.microsoft.com/en-us/library/dd997411.aspx

Was it helpful?

Solution

Yes, partitioning the input array is a good approach.

In fact, Microsoft provide a Partitioner class to help with exactly this approach.

Here's an example showing how to do it:

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

namespace Demo
{
    class Program
    {
        private void run()
        {
            double sum = 0;
            Func<double, double> func = x => Math.Sqrt(Math.Sin(x));
            object locker = new object();

            double[] data = testData();

            // For each double in data[] we are going to calculate Math.Sqrt(Math.Sin(x)) and
            // add all the results together.
            //
            // To do this, we use class Partitioner to split the input array into just a few partitions,
            // (the Partitioner will use knowledge about the number of processor cores to optimize this)
            // and then add up all the values using a separate thread for each partition.
            //
            // We use threadLocalState to compute the total for each partition, and then we have to
            // add all these together to get the final sum. We must lock the additon because it isn't
            // threadsafe, and several threads could be doing it at the same time.

            Parallel.ForEach
            (
                Partitioner.Create(0, data.Length),

                () => 0.0,

                (subRange, loopState, threadLocalState) =>
                {
                    for (int i = subRange.Item1; i < subRange.Item2; i++)
                    {
                        threadLocalState += func(data[i]);
                    }

                    return threadLocalState;
                },

                finalThreadLocalState =>
                {
                    lock (locker)
                    {
                        sum += finalThreadLocalState;
                    }
                }
            );

            Console.WriteLine("Sum = " + sum);
        }

        private static double[] testData()
        {
            double[] array = new double[1000003]; // Test with an odd number of values.

            Random rng = new Random(12345);

            for (int i = 0; i < array.Length; ++i)
                array[i] = rng.Next() & 3; // Don't want large values for this simple test.

            return array;
        }

        static void Main()
        {
            new Program().run();
        }
    }
}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top