Question

I have a list of strings that I need to pass to a process in a different class. I want to know which of the two approaches below would be better in terms of speed, efficiency, and parallel processing. The list contains roughly 10,000 strings, and I want to limit the work to about 5 threads running at one time:

For i as integer = 0 to searchPages.Count - 1
    Parallel.For(0,10,Sub(x)
                        ps.processPage(searchPages.Item(i))
                 End Sub)
Next

The task factory version seems to work fine, but I am not sure which to implement.

For i As Integer = 0 To searchPages.Count - 1
    Dim fact As Task = Task.Factory.StartNew(Sub() ps.processPage(searchPages.Item(i)))
    If i = 11 Then
        Tasks.Task.WaitAll()
    End If
Next

Any ideas appreciated.


Solution

For this type of pure data parallelism, I would recommend using Parallel.ForEach:

Parallel.ForEach(searchPages, Sub(page) ps.processPage(page))

If you want to restrict this to use 5 threads, you can do that via ParallelOptions.MaxDegreeOfParallelism:

Dim po as New ParallelOptions
po.MaxDegreeOfParallelism = 5
Parallel.ForEach(searchPages, po, Sub(page) ps.processPage(page))

This will have less overhead than Task.Factory.StartNew, since the partitioning within the Parallel class will reuse Tasks and prevent over-scheduling. It will also use the current thread for some of the processing instead of forcing it into a wait state, which further reduces the total overhead.
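
To see the cap take effect, here is a minimal, self-contained sketch. It assumes a console application; ProcessPage is a hypothetical stand-in for ps.processPage that sleeps briefly and records how many calls overlap, and the 200 generated page names are placeholder data rather than anything from the question.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Threading
Imports System.Threading.Tasks

Module Program
    ' Shared counters, guarded by a lock, used only to observe concurrency.
    Private ReadOnly gate As New Object()
    Private running As Integer = 0
    Private peak As Integer = 0

    ' Hypothetical stand-in for ps.processPage (assumption, not the asker's code).
    Sub ProcessPage(page As String)
        SyncLock gate
            running += 1
            If running > peak Then peak = running
        End SyncLock

        Thread.Sleep(1) ' placeholder for the real per-page work

        SyncLock gate
            running -= 1
        End SyncLock
    End Sub

    Sub Main()
        ' Placeholder data standing in for the real list of strings.
        Dim searchPages As List(Of String) =
            Enumerable.Range(0, 200).Select(Function(n) "page" & n.ToString()).ToList()

        ' Limit the loop to at most 5 concurrent operations.
        Dim po As New ParallelOptions With {.MaxDegreeOfParallelism = 5}
        Parallel.ForEach(searchPages, po, Sub(page) ProcessPage(page))

        Console.WriteLine("Peak concurrency: " & peak.ToString())
    End Sub
End Module
```

With MaxDegreeOfParallelism set to 5, the reported peak should never exceed 5. Dropping the po argument lets the runtime choose the degree of parallelism, which makes it easy to compare the two configurations on the real workload.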

Other tips

If I were you, I wouldn't worry too much about how many threads are being used (unless you can show it's a problem). Just use Parallel.ForEach and let the runtime work out the optimum number of threads.

Take a look at the answers to this question for some good details on how the threads are managed by the runtime for you.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow