The problem is the range-based for loop.
If we take a look at how the range-based for statement is defined, we see that the end iterator of the loop is computed only once, on entering the loop. At that point there is probably (this is a race) only one future in your vector: the one you pushed back in the line above. After that task finishes, the iterator is incremented, compares equal to your old end iterator, and the loop finishes, even though the vector might now contain more elements that were pushed back by your first task. There are further problems as well.
The destructor of the vector, which runs after the loop finishes, normally calls the destructor of each of its elements, and for a future from std::async that is equivalent to calling wait. However, your tasks are still adding elements to the vector while it is already in its destructor, which is probably UB.
Another point is that the end iterator you created on entering the for loop is invalidated as soon as your first thread calls push_back on the vector (at least for a std::vector), which means that you are operating on invalidated iterators.
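To illustrate the point about the end iterator being evaluated only once, here is a minimal sketch. CountingRange and countEndCalls are hypothetical names made up for this example (they are not from your code); the range simply counts how often its end() is called by a range-based for loop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical range type that counts how often end() is called.
struct CountingRange {
    std::vector<int> data;
    std::size_t endCalls = 0;

    std::vector<int>::iterator begin() { return data.begin(); }
    std::vector<int>::iterator end() { ++endCalls; return data.end(); }
};

// Iterates over a three-element range and reports the number of end() calls.
std::size_t countEndCalls() {
    CountingRange range;
    range.data = {1, 2, 3};

    int sum = 0;
    // Behaves roughly like:
    //   auto first = range.begin(), last = range.end();
    //   for (; first != last; ++first) { int value = *first; ... }
    for (int value : range) {
        sum += value;
    }
    (void)sum;  // the sum only exists to give the loop a body

    return range.endCalls;  // 1: end() was evaluated once, before the first iteration
}
```

Since the loop never asks for end() again, elements appended while it is running are not part of the iteration.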
As a solution I would propose avoiding the global task list and using a local task list in your searchFiles function instead. You can then wait on all of your local futures in searchFiles at each level of the recursion. This is a common pattern in non-managed recursive parallelism.
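A rough sketch of what that could look like follows. I don't know your actual searchFiles signature, so the Dir struct (an in-memory stand-in for a directory tree), the needle parameter and the return type are all assumptions made purely for illustration. The important part is that each call owns its own vector of futures and waits on them before returning.

```cpp
#include <future>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a directory tree (assumption for this sketch).
struct Dir {
    std::vector<std::string> files;
    std::vector<Dir> subdirs;
};

// Returns the names of all files in dir and below that contain needle.
std::vector<std::string> searchFiles(const Dir& dir, const std::string& needle) {
    // Local task list: one future per direct subdirectory of this level.
    // Only this call ever touches the vector, so there is no race on it.
    std::vector<std::future<std::vector<std::string>>> tasks;
    tasks.reserve(dir.subdirs.size());
    for (const Dir& sub : dir.subdirs) {
        tasks.push_back(std::async(std::launch::async, [&sub, &needle] {
            return searchFiles(sub, needle);
        }));
    }

    // Handle the files of this level while the subdirectory tasks are running.
    std::vector<std::string> found;
    for (const std::string& file : dir.files) {
        if (file.find(needle) != std::string::npos) {
            found.push_back(file);
        }
    }

    // Wait on all local futures before returning, so nothing outlives this call.
    for (auto& task : tasks) {
        std::vector<std::string> fromSub = task.get();
        found.insert(found.end(), fromSub.begin(), fromSub.end());
    }
    return found;
}
```

All futures are pushed back before the loop that waits on them starts, so the loop sees a vector that no longer changes. Note that std::launch::async may start a thread per subdirectory, which can be too many for a deep tree; limiting that is a separate concern.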
Note: I don't know all the details of the PPL concurrent_vector, but I assume it behaves similarly to a std::vector.