Is a memory barrier needed in this situation, or is volatile enough? [duplicate]

https://stackoverflow.com/questions/23359265

11-07-2023

Question

I'm reading this article and followed the author's steps, but I get a different result.

I create two threads: one reader and one writer.

// volatile uint64_t variable1 = 0; <- global
// uint64_t* variable2_p = new uint64_t(0); <- in main function
// const unsigned ITERATIONS = 2000000000; <- global

void *reader(void *variable2) {
    volatile uint64_t *variable2_p = (uint64_t *)variable2;
    // bind this thread to CPU0

    unsigned i, failureCount = 0;
    for (i = 0; i < ITERATIONS; i++) {
        uint64_t v2 = *variable2_p;
        uint64_t v1 = variable1;
        if (v2 > v1) {
            failureCount++;
            printf("v1:%" PRIu64 ", v2:%" PRIu64 "\n", v1, v2);
        }
    }
    printf("%u failure(s)", failureCount);
    return NULL;
}

void *writer(void *variable2) {
    volatile uint64_t *variable2_p = (uint64_t *)variable2;
    // bind this thread to CPU1

    for (;;) {
        variable1 = variable1 + 1;
        *variable2_p = (*variable2_p) + 1;
    }
    return NULL;
}

In the article above, the author says that the check v2 <= v1 may fail some of the time, because the compiler or the processor may reorder the operations.

But I have tried many times and never see a failure. Is it correct to use only volatile in this situation, or will it lead to subtle bugs?

If it isn't OK, please give me an example. Thanks a lot.

compile command: g++  -O2 -Wall -g -o foo foo.cc -lpthread
uname -a: Linux Wichmann 3.5.0-48-generic #72~precise1-Ubuntu SMP Tue Mar 11 20:09:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
cpuid: Intel(R) Core(TM) i5-3230M CPU @ 2.60GHz

Solution

"It may fail" doesn't mean that it will fail on every machine. If you're running on a single core, for example, it probably won't fail. If you're running on a multicore Alpha, it almost certainly will fail some of the time. On other machines, the results will vary, depending on any number of things.

As for volatile, it offers no guarantees for multi-threaded code. It may be necessary if you're also using inline assembler or other things the compiler can't understand, but otherwise, once you do whatever else is necessary to ensure thread safety, you don't need volatile. In particular, if you use the C++11 atomic types or threading primitives, volatile is never necessary.
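As a rough illustration of the C++11 route, here is a minimal sketch of the same reader/writer pair using std::atomic with release/acquire ordering instead of volatile. The names stop, run_demo and the iterations parameter are made up for this example and are not in the question, and the writer uses fetch_add where the original used a plain read-modify-write.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

std::atomic<std::uint64_t> variable1{0};
std::atomic<std::uint64_t> variable2{0};
std::atomic<bool> stop{false};  // hypothetical flag so the writer can be joined

// Writer: bump variable1 first, then publish variable2 with release ordering,
// so the earlier write to variable1 is visible to anyone who acquires variable2.
void writer() {
    while (!stop.load(std::memory_order_relaxed)) {
        variable1.fetch_add(1, std::memory_order_relaxed);
        variable2.fetch_add(1, std::memory_order_release);
    }
}

// Reader: acquire variable2 first, then read variable1.
// Returns how many times v2 > v1 was observed.
unsigned reader(unsigned iterations) {
    unsigned failures = 0;
    for (unsigned i = 0; i < iterations; ++i) {
        std::uint64_t v2 = variable2.load(std::memory_order_acquire);
        std::uint64_t v1 = variable1.load(std::memory_order_relaxed);
        if (v2 > v1) {
            ++failures;
        }
    }
    return failures;
}

// Hypothetical driver: runs the writer in a second thread while reading.
unsigned run_demo(unsigned iterations) {
    stop.store(false);
    std::thread w(writer);
    unsigned failures = reader(iterations);
    stop.store(true);
    w.join();
    return failures;
}
```

Calling run_demo(2000000000u) from main mirrors the loop count in the question. With this pairing, the language itself guarantees v2 <= v1, rather than the result depending on what a particular compiler and CPU happen to do.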

OTHER TIPS

Edit

Actually, in this very specific case the code is correct: variable1 is volatile and variable2_p is declared as a pointer to volatile, so the compiler must keep these accesses in program order.

Volatile is misused so often that I jumped the gun here, sorry.

Old answer:

Using volatile only guarantees two things:

Reads and writes happen to the actual memory address where the variable resides and are not optimized away or kept in registers
Sequential accesses to volatile variables are not reordered

You asked for an example, but you have given it yourself in your question:

variable1 = variable1 + 1;
*variable2_p = (*variable2_p) + 1;

These two statements could be reordered, leading to the failure in the other thread. That this doesn't happen in your specific environment is irrelevant. The compiler is allowed to do it, so the code is not correct.
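If you want an explicit barrier between the two statements, one possible shape is sketched below with std::atomic_thread_fence and relaxed atomics. This is an assumption about how one might write it, not code from the article. All names here (fenced_v1, fenced_v2, fenced_stop, run_fenced_demo) are invented for the example.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

std::atomic<std::uint64_t> fenced_v1{0};
std::atomic<std::uint64_t> fenced_v2{0};
std::atomic<bool> fenced_stop{false};  // hypothetical stop flag

void fenced_writer() {
    std::uint64_t n = 0;
    while (!fenced_stop.load(std::memory_order_relaxed)) {
        ++n;
        fenced_v1.store(n, std::memory_order_relaxed);
        // Release fence: the store above may not move below this point.
        std::atomic_thread_fence(std::memory_order_release);
        fenced_v2.store(n, std::memory_order_relaxed);
    }
}

unsigned fenced_reader(unsigned iterations) {
    unsigned failures = 0;
    for (unsigned i = 0; i < iterations; ++i) {
        std::uint64_t v2 = fenced_v2.load(std::memory_order_relaxed);
        // Acquire fence: the load below may not move above this point.
        std::atomic_thread_fence(std::memory_order_acquire);
        std::uint64_t v1 = fenced_v1.load(std::memory_order_relaxed);
        if (v2 > v1) {
            ++failures;
        }
    }
    return failures;
}

// Hypothetical driver: writer runs in a second thread while this one reads.
unsigned run_fenced_demo(unsigned iterations) {
    fenced_stop.store(false);
    std::thread w(fenced_writer);
    unsigned failures = fenced_reader(iterations);
    fenced_stop.store(true);
    w.join();
    return failures;
}
```

The release fence in the writer pairs with the acquire fence in the reader, so a reader that sees a given value of fenced_v2 is guaranteed to see at least that value in fenced_v1.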

Licensed under: CC-BY-SA with attribution
