Question

I have this sample code.

List<Dictionary<string,string>> objects = new List<Dictionary<string,string>>();

foreach (string url in urls)
{
    objects.Add(processUrl(url));
}

I need to process each URL. processUrl downloads the page and runs many regexes to extract some information, then returns a "C# JSON-like" object. I want to run this in parallel, and at the end I need a list of the objects, so I have to wait for all the tasks to finish before continuing. How can I accomplish this? I have seen many examples, but none that save the return value.

Regards

Solution

Like this?

var results = urls.AsParallel().Select(processUrl).ToList();

With Parallel:

var syncObject = new object();
Parallel.ForEach(
    urls,
    url =>
    {
        var result = processUrl(url);
        // List<T> is not thread-safe, so guard each Add with a lock.
        lock (syncObject)
            objects.Add(result);
    });

or

var objects = new ConcurrentBag<Dictionary<string,string>>();
Parallel.ForEach(urls, url => objects.Add(processUrl(url)));
var result = objects.ToList();

or with Tasks:

var tasks = urls
    .Select(url => Task.Factory.StartNew(() => processUrl(url)))
    .ToArray();

Task.WaitAll(tasks);
var results = tasks.Select(arg => arg.Result).ToList();

OTHER TIPS

First, refactor as

processUrl(url, objects);

and make the task responsible for adding the results to the list.

Then add locking so two parallel tasks don't try to use the results list at exactly the same time.


Note: async support in the next version of .NET will make this trivially easy.
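
For reference, async/await shipped with C# 5 and .NET 4.5. Here is a rough sketch of how the same thing might look with it, assuming processUrl stays synchronous and is simply pushed onto the thread pool with Task.Run. The Crawler class, the ProcessAllAsync name and the stub body of processUrl are made up for illustration; only Task.Run and Task.WhenAll are real framework calls.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

static class Crawler
{
    // Hypothetical stand-in for the question's processUrl; the real one
    // would download the page and run the regexes.
    static Dictionary<string, string> processUrl(string url)
    {
        return new Dictionary<string, string> { { "url", url } };
    }

    // Starts one task per URL, awaits all of them, and returns the
    // results in the same order as the input URLs.
    public static async Task<List<Dictionary<string, string>>> ProcessAllAsync(
        IEnumerable<string> urls)
    {
        var tasks = urls.Select(url => Task.Run(() => processUrl(url)));
        Dictionary<string, string>[] results = await Task.WhenAll(tasks);
        return results.ToList();
    }
}
```

From an async method you would call it as var objects = await Crawler.ProcessAllAsync(urls); no lock is needed, because each task returns its own result and Task.WhenAll collects them into one array.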

You can use Parallel.ForEach from the Task Parallel Library; this requires .NET 4.0 or later:

System.Threading.Tasks.Parallel
          .ForEach(urls, url => {
             var result = processUrl(url);
             lock(objects)
             {
                  objects.Add(result);
             }
           });
Licensed under: CC-BY-SA with attribution