Question

I have this sample code.

List<Dictionary<string,string>> objects = new List<Dictionary<string,string>>();

foreach (string url in urls)
{
    objects.Add(processUrl(url));
}

I need to process each URL. processUrl downloads the page and runs many regexes to extract some information, then returns a "C# JSON-like" object. I want to run this in parallel, and at the end I need a list of the objects, so I have to wait for all tasks to finish before continuing. How can I accomplish this? I have seen many examples, but none that save the return value.

Regards


Solution

Like this?

var results = urls.AsParallel().Select(processUrl).ToList();

With Parallel:

Parallel.ForEach(
    urls, 
    url =>
    {
        var result = processUrl(url);
        lock (syncObject)
            objects.Add(result);
    });

or

var objects = new ConcurrentBag<Dictionary<string,string>>();
Parallel.ForEach(urls, url => objects.Add(processUrl(url)));
var result = objects.ToList();

or with Tasks:

var tasks = urls
    .Select(url => Task.Factory.StartNew(() => processUrl(url)))
    .ToArray();

Task.WaitAll(tasks);
var results = tasks.Select(arg => arg.Result).ToList();

Other tips

First, refactor as

processUrl(url, objects);

and make the task responsible for adding the results to the list.

Then add locking so two parallel tasks don't try to use the results list at exactly the same time.
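
A minimal sketch of that refactoring. The body of processUrl here is a hypothetical stand-in (the real one would download the page and run the regexes), and the lock object named sync and the example URLs are assumptions made for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class Program
{
    // Guards every access to the shared results list (name is an assumption).
    static readonly object sync = new object();

    // Hypothetical stand-in: the real method would download the page and run the regexes.
    static void processUrl(string url, List<Dictionary<string, string>> objects)
    {
        // Do the expensive work outside the lock so the tasks really overlap.
        var result = new Dictionary<string, string> { { "url", url } };

        // Hold the lock only for the Add itself.
        lock (sync)
        {
            objects.Add(result);
        }
    }

    static void Main()
    {
        var urls = new[] { "http://example.com/a", "http://example.com/b" };
        var objects = new List<Dictionary<string, string>>();

        // Parallel.ForEach returns only after every url has been processed.
        Parallel.ForEach(urls, url => processUrl(url, objects));

        Console.WriteLine(objects.Count); // one entry per url
    }
}
```

Keeping the lock around only the Add call matters: if the download and regex work happened inside the lock, the tasks would run one at a time.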


Note: async support in the next version of .NET will make this trivially easy.
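
For reference, a sketch of what this can look like with async/await and Task.WhenAll, which shipped in .NET 4.5. It assumes a hypothetical asynchronous variant named processUrlAsync; its body is a stand-in that uses Task.Delay in place of the real download and regex work, and ProcessAllAsync is likewise a name invented for this example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class Program
{
    // Hypothetical async variant of processUrl; the body is a stand-in.
    static async Task<Dictionary<string, string>> processUrlAsync(string url)
    {
        await Task.Delay(100); // stands in for the download
        return new Dictionary<string, string> { { "url", url } };
    }

    static async Task<List<Dictionary<string, string>>> ProcessAllAsync(IEnumerable<string> urls)
    {
        // Task.WhenAll enumerates the sequence, which starts every operation,
        // and completes once all of them have finished.
        var tasks = urls.Select(url => processUrlAsync(url));

        // The result array is in the same order as the input tasks.
        Dictionary<string, string>[] results = await Task.WhenAll(tasks);
        return results.ToList();
    }

    static void Main()
    {
        var urls = new[] { "http://example.com/a", "http://example.com/b" };

        // Blocking here keeps the sketch runnable on compilers without async Main.
        var objects = ProcessAllAsync(urls).GetAwaiter().GetResult();
        Console.WriteLine(objects.Count); // one entry per url
    }
}
```

No lock is needed in this version because each task returns its own result and nothing writes to a shared list.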

You can use Parallel.ForEach from the Task Parallel Library; this requires .NET 4.0:

System.Threading.Tasks.Parallel.ForEach(urls, url =>
{
    var result = processUrl(url);
    lock (objects)
    {
        objects.Add(result);
    }
});
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow