Question

Our processors are allowed to reorder the instruction in order to gain some performance benefits, but this may cause some strange behaviour. I'm trying to reproduce one of this issues on the base of this article.

This is my code:

int a,b;
int r1,r2;
mutex m1,m2;

void th1()
{
    for(;;)
    {
        m1.lock();
        a=1;
        asm volatile("" ::: "memory"); 
        r1=b;
        m1.unlock();
    }
}

void th2()
{
    for(;;)
    {
        m2.lock();
        b=1;
        asm volatile("" ::: "memory"); 
        r2=a;
        m2.unlock();
    }
}

int main()
{
    int cnt{0};
    thread thread1{th1};
    thread thread2{th2};
    thread1.detach();
    thread2.detach();
    for(int i=0;i<10000;i++)
    {
        m1.lock();
        m2.lock();
        a=b=0;
        m1.unlock();
        m2.unlock();
        if(r1==0&&r2==0)
        {
            ++cnt;
        }
    }
    cout<<cnt<<" CPU reorders happened!\n";
}

I use the mutexes to be sure that the 'main' thread will not modify a nor b when th1 or th2 are doing their jobs, the output of the execution change all the time, it may be 0, it may be 10000 or a casual number between 0 and 10000.

There's something about this code that makes me little unconfortable, i'm not sure whether it really reproduce the CPU reordering phenomen.

From the code looks like the only way r1 and r2 may be 0 in the 'if' is because of th1 and th2 set them to the value from 'a' and 'b', which in the context of th1 and th2 cannot be 0 due to the lock mechanism, the only way those variables are 0 is because of the instruction reordering, this this correct?

Thanks

Was it helpful?

Solution

Your program is very different from the one in the article that you cited from preshing.com. The preshing.com program uses semaphores where yours uses mutexes.

Mutexes are simpler than semaphores. They only make one guarantee--that only one thread at a time can lock the mutex. That is to say, they can only be used for mutual exclusion.

The preshing.com program does something with its semaphores that you can't do with mutexes alone: It synchronizes the loops in the three threads so that they all proceed in lock-step. Thread1 and Thread2 each wait at the top of their loop until main() lets them go, and then main waits at the bottom of its loop until they have completed their work. Then they all go 'round again.

You can't do that with mutexes. In your program, what prevents main from going around its loop thousands of times before either of the other two threads gets to run at all? Nothing but chance. Nor does anything prevent Thread1 and/or Thread2 from looping thousands of times while main() is blocked, waiting for its next time slice.

Remember, a semaphore is a counter. Look carefully at how the semaphores in the preshing.com are incremented and decremented by the threads, and you will see how it keeps the threads synchronized.

OTHER TIPS

I made the mistake of using the mutexes instead of the semaphores (thanks james large), this is the properly working code:

#include <mutex>
#include <condition_variable>
using namespace std;

class semaphore{
private:
    mutex mtx;
    condition_variable cv;
    int cnt;

public:
    semaphore(int count = 0):cnt(count){}
    void notify()
    {
        unique_lock<mutex> lck(mtx);
        ++cnt;
        cv.notify_one();
    }
    void wait()
    {
        unique_lock<mutex> lck(mtx);

        while(cnt == 0){
            cv.wait(lck);
        }
        --cnt;
    }
};

int a,b;
int r1,r2;
semaphore s1,s2,s3;

void th1()
{
    for(;;)
    {
        s1.wait();
        a=1;
        asm volatile("" ::: "memory");
        r1=b;
        s3.notify();
    }
}

void th2()
{
    for(;;)
    {
        s2.wait();
        b=1;
        asm volatile("" ::: "memory");
        r2=a;
        s3.notify();
    }
}

int main()
{
    int cnt{0};
    thread thread1{th1};
    thread thread2{th2};
    thread1.detach();
    thread2.detach();
    for(int i=0;i<100000;i++)
    {
        a=b=0;
        s1.notify();
        s2.notify();
        s3.wait();
        s3.wait();

        if(r1==0&&r2==0)
        {
            ++cnt;
        }
    }
    cout<<cnt<<" CPU reorders happened!\n";
}

The reordering seems to be properly reproduced.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top