質問

I am looking for a way to optimize the following code, for an open source project that I develop, or make it more performant by moving the heavy work to another thread.

void ProfilerCommunication::AddVisitPoint(ULONG uniqueId)
{
    CScopedLock<CMutex> lock(m_mutexResults);
    m_pVisitPoints->points[m_pVisitPoints->count].UniqueId = uniqueId;
    if (++m_pVisitPoints->count == VP_BUFFER_SIZE)
    {
        SendVisitPoints();
        m_pVisitPoints->count=0;
    } 
}

The above code is used by the OpenCover profiler (an open source code coverage tool for .NET written in C++) when each visit point is called. The mutex is used to protect some shared memory (a 64K block shared between several processes 32/64 bit and C++/C#) when full it signals the host process. Obviously this is quite heavy for each instrumentation point and I'd like to make the impact lighter.

I am thinking of using a queue which is pushed to by the above method and a thread to pop the data and populate the shared memory.

Q. Is there a thread-safe queue in C++ (Windows STL) that I can use - or a lock-less queue as I wouldn't want to replace one issue with another? Do people consider my approach sensible?


EDIT 1: I have just found concurrent_queue.h in the include folder - could this be my answer...?

役に立ちましたか?

解決

Okay I'll add my own answer - concurrent_queue works very well

using the details described in this MSDN article I implemented concurrent queue (and tasks and my first C++ lambda expression :) ) I didn't spend long thinking though as it is a spike.

inline void AddVisitPoint(ULONG uniqueId) { m_queue.push(uniqueId); }

...
// somewhere else in code

m_tasks.run([this]
{
    ULONG id;
    while(true)
    {
         while (!m_queue.try_pop(id)) 
            Concurrency::Context::Yield();

        if (id==0) break; // 0 is an unused number so is used to close the thread/task
        CScopedLock<CMutex> lock(m_mutexResults);
        m_pVisitPoints->points[m_pVisitPoints->count].UniqueId = id;
        if (++m_pVisitPoints->count == VP_BUFFER_SIZE)
        {
            SendVisitPoints();
            m_pVisitPoints->count=0;
        }
    }
});

Results:

  • Application without instrumentation = 9.3
  • Application with old instrumentation handler = 38.6
  • Application with new instrumentation handler = 16.2

他のヒント

Here it mentions not all container operations are thread safe on Windows. Only a limited number of methods. And I don't believe C++ standards mention about threadsafe containers. I maybe wrong, but checked the standards nothing came up

Would it be possible to offload the client's communication into a separate thread? Then the inspection points can use thread local storage to record their hits and only need to communicate with a local thread to pass off a reference when full. The communication thread can then take its time to pass on the data to the actual collector since it's not on the hot path anymore.

You could use a lock free queue. Herb Sutter has some articles here.

ライセンス: CC-BY-SA帰属
所属していません StackOverflow
scroll top