Question

I have a bunch of threads that are doing lots of communication with each other. I would prefer this be lock free.

For each thread, I want to have a mailbox, where other threads can send it messages, (but only the owner can remove messages). This is a multiple-producer single-consumer situation. is it possible for me to do this in a lockfree / high performance matter? (This is in the inner loop of a gigantic simulation.)

Was it helpful?

Solution

Sure, if you have an atomic CompareAndSwap instruction:

for (i = 0; ; i = (i + 1) % MAILBOX_SIZE)
{
    if ((mailbox[i].owned == false) &&
        (CompareAndSwap(&mailbox[i].owned, true, false) == false))
        break;
}

mailbox[i].message = message;
mailbox[i].ready = true;

After reading a message, the consuming thread just sets mailbox[i].ready = false; mailbox[i].owned = false; (in that order).

OTHER TIPS

Lock-free Multiple Producer Single Consumer (MPSC) Queue is one of the easiest lock-free algorithms to implement.

The most basic implementation requires a simple lock-free singly-linked list (SList) with only push() and flush(). The functions are available in the Windows API as InterlockedFlushSList() and InterlockedPushEntrySList() but these are very easy to roll on your own.

Multiple Producer push() items onto the SList using a CAS (interlocked compare-and-swap).

The Single Consumer does a flush() which swaps the head of the SList with a NULL using an XCHG (interlocked exchange). The Consumer then has a list of items in the reverse-order.

To process the items in order, you must simply reverse the list returned from flush() before processing it. If you do not care about order, you can simply walk the list immediately to process it.

Two notes if you roll your own functions:

1) If you are on a system with weak memory ordering (i.e. PowerPC), you need to put a "release memory barrier" at the beginning of the push() function and an "aquire memory barrier" at the end of the flush() function.

2) You can make the functions considerably simplified and optimized because the ABA-issue with SLists occur during the pop() function. You can not have ABA-issues with a SList if you use only push() and flush(). This means you can implement it as a single pointer very similar to the non-lockfree code and there is no need for an ABA-prevention sequence counter.

Here's a paper from the University of Rochester illustrating a non-blocking concurrent queue. The algorithm described in the paper shows one technique for making a lockless queue.

may want to look at Intel thread building blocks, I recall being to lecture by Intel developer that mentioned something along those lines.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top