Many detached boost threads segfault

https://stackoverflow.com/questions/20717079

20-09-2022
|

Question

I'm creating boost threads inside a function with

while(trueNonceQueue.empty() && block.nNonce < std::numeric_limits<uint64_t>::max()){
    if ( block.nNonce % 100000 == 0 )
    {
        cout << block.nNonce << endl;
    }
    boost::thread t(CheckNonce, block);
    t.detach();
    block.nNonce++;
}
uint64 trueNonce;
while (trueNonceQueue.pop(trueNonce))
        block.nNonce = trueNonce;

trueNonceQueue was created with boost::lockfree::queue<uint64> trueNonceQueue(128); in the global scope.

This is the function being threaded

void CheckNonce(CBlock block){
    if(block.CheckBlockSilently()){
        while (!trueNonceQueue.push(block.nNonce))
            ;
    }
}

I noticed that after it crashed, my swap had grown marginally which never happens unless if I use poor technique like this after leaking memory; otherwise, my memory usage stays frequently below 2 gigs. I'm running cinnamon on ubuntu desktop with chrome and a few other small programs open. I was not using the computer at the time this was running.

The segfault occurred after the 949900000th iteration. How can this be corrected?

CheckNonce execution time

I added the same modulus to CheckNonce to see if there was any lag. So far, there is none.

I will update if the detached threads start to lag behind the spawning while.

Solution

You should use a Thread Pool instead. This means spawning just enough threads to get work done without undue contention (for example you might spawn something like N-2 threads on an N-core machine, but perhaps more if some work may block on I/O).

There is not exactly a thread pool in Boost, but there are the parts you need to build one. See here for some ideas: boost::threadpool::pool vs.boost::thread_group

Or you can use a more ready-made solution like this (though it is a bit dated and perhaps unmaintained, not sure): http://threadpool.sourceforge.net/

Then the idea is to spawn the N threads, and then in your loop for each task, just "post" the task to the thread pool, where the next available worker thread will pick it up.

By doing this, you will avoid many problems, such as running out of thread stack space, avoiding inefficient resource contention (look up the "thundering herd problem"), and you will be able to easily tune the aggressiveness with which you use multiple cores on any system.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow