Question

I've written a low lock list in C++ on windows 32-bit. I'm getting BIG improvements over using critical sections but I'd like someone to sanity check that what I'm doing is correct and there aren't any errors in what I've done:

#ifndef __LOW_LOCK_STACK_H_
#define __LOW_LOCK_STACK_H_

template< class T > class LowLockStack
{
protected:
    struct Entry
    {
        Entry*  pNext;
        T*              pData;
    };
    union Header
    {
        __int64         m_XChg;
        struct
        {
                Entry*  m_pNext;
                __int16 m_Depth;
                __int16 m_Counter;
        };
    };

    Header      m_Header;
public:
    LowLockStack()
    {
        m_Header.m_pNext        = NULL;
        m_Header.m_Depth        = 0;
        m_Header.m_Counter  = 0;
    }

    ~LowLockStack()
    {
    }

    void PushEntry( T* pData )
    {
        Entry* pEntry   = new Entry;
        pEntry->pData   = pData;

        Header header;
        Header xchg;
        do
        {
            xchg.m_XChg   = m_Header.m_XChg;

            header.m_pNext  = pEntry;
            header.m_Depth  = xchg.m_Depth + 1;
            header.m_Counter = xchg.m_Counter + 1;
            pEntry->pNext  = xchg.m_pNext;
        } while( _InterlockedCompareExchange64( &m_Header.m_XChg, header.m_XChg, xchg.m_XChg ) != xchg.m_XChg );
    }

    T* PopEntry()
    {
        Entry* pEntry   = NULL;
        Header header;
        Header xchg;
        do
        {
            xchg.m_XChg    = m_Header.m_XChg;

            pEntry                  = xchg.m_pNext;
            if ( pEntry == NULL )
            {
                 return NULL;
            }

            header.m_pNext  = pEntry->pNext;
            header.m_Depth  = xchg.m_Depth - 1;

        } while( _InterlockedCompareExchange64( &m_Header.m_XChg, header.m_XChg, xchg.m_XChg ) != xchg.m_XChg );

        T* pRet = pEntry->pData;
        delete pEntry;

        return pRet;
    }

    __int32 GetDepth()
    {
        return m_Header.m_Depth;
    }
};

#endif

If there are no errors (which i doubt ;)) then think of it as a reference implementation :D

Edit: I've updated the code taking into account a number of criticisms.

Was it helpful?

Solution

You do not synchronize access to the list header member. This is bad at least on 2 levels:

  • assigning values to the list header can be not as atomic as you think. Which means that an unsynchronized read operation can potentially get a corrupted value.

  • another, more probable issue with this is that if your box has multiple cores, every one of them can have in the processor cache its own copy of the value. To make them synchronize the values you need a memory barrier

OTHER TIPS

As you discovered, lock free programming is very difficult to get right.

Windows already has support for lock-free singly linked lists,http://msdn.microsoft.com/en-us/library/ms684121(VS.85).aspx, you should try using that rather than roll your own.

Consider what would happen in the following sequence of events when there are two items, A and B, in the list (stack really) like head -> A -> B, count is 2:

  1. Thread 1 starts the pop() call but gets preempted right before _InterlockedCompareExchange64()
  2. Thread 2 removes both items, A and B, from the stack and then puts two new items back onto the stack and the top item happens to be allocated at the same address as A, so we have, say head -> A -> D. Note that the count is back to 2
  3. Thread 1 resumes and successfully performs the CAS ( _InterlockedCompareExchange64()). Now head points to B, which was deallocated (bad) and D is lost (bad).

This is a classic ABA problem. You should be using the second word as a generation number instead of item count, i.e. do not ever decrement it.

There's a mailing list discussion going on right now about experimental boost::lockfree library.

Also take a look at Herb Sutter's lock-free queue - this is a different approach where a dummy node keeps producer and consumer from stepping on each other.

The most obvious error is the name you've given it. Regardless of the fact that you've implemented it as a linked list, what you've implemented is a stack.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top