Guarantees of benign race conditions in C++

Question 1

The problem with data races is not that you can read a wrong value at the machine level. The problem is that both the compiler and the processor perform many optimizations on the code. For these optimizations to remain correct in the presence of multiple threads, they need additional information about which variables can be shared between threads. Without that information, such optimizations can, for example:

reorder operations
add additional load and store operations
remove load and store operations

There is a good paper on benign data races by Hans Boehm called How to miscompile programs with "benign" data races. The following excerpt is taken from this paper:

Double checks for lazy initialization

This is well-known to be incorrect at the source-code level. A typical use case looks something like
if (!init_flag) {
    lock();
    if (!init_flag) {
        my_data = ...;
        init_flag = true;
    }
    unlock();
}
tmp = my_data;
Nothing prevents an optimizing compiler from either reordering the setting of my_data with that of init_flag, or even from advancing the load of my_data to before the first test of init_flag, reloading it in the conditional if init_flag was not set. Some non-x86 hardware can perform similar reorderings even if the compiler performs no transformation. Either of these can result in the final read of my_data seeing an uninitialized value and producing incorrect results.
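
Not part of the paper: a minimal sketch of how the same pattern can be written in C++11 and later so that the reorderings described above are ruled out. It assumes init_flag becomes a std::atomic<bool>; the value 42, the mutex name init_mutex, and the helper all_threads_see_initialized_data are hypothetical and only there for illustration.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// my_data and init_flag mirror the excerpt; everything else is illustrative.
static int my_data = 0;
static std::atomic<bool> init_flag{false};
static std::mutex init_mutex;

int get_data() {
    // Acquire load: a thread that sees true also sees the write to my_data.
    if (!init_flag.load(std::memory_order_acquire)) {
        std::lock_guard<std::mutex> guard(init_mutex);
        // Relaxed is enough here because the mutex already orders this check.
        if (!init_flag.load(std::memory_order_relaxed)) {
            my_data = 42;  // placeholder for the elided initialization
            // Release store: the write to my_data cannot move after it.
            init_flag.store(true, std::memory_order_release);
        }
    }
    return my_data;
}

// Hypothetical helper: calls get_data() from several threads at once.
bool all_threads_see_initialized_data(int thread_count) {
    std::vector<int> seen(thread_count, -1);
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i)
        threads.emplace_back([&seen, i] { seen[i] = get_data(); });
    for (auto& t : threads) t.join();
    for (int v : seen)
        if (v != 42) return false;
    return true;
}
```

In many cases std::call_once or a function-local static gives the same guarantee with less code; the explicit version is shown only because it maps directly onto the excerpt.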

Here is another example, where int x is a shared variable and int r is a local variable.

int r = x;
if (r == 0)
    printf("foo\n");
if (r != 0)
    printf("bar\n");

If we only said that reading x yields an unspecified value, the program would print either "foo" or "bar". But if the compiler transforms the code as follows, x is read twice, so the program might also print both strings or neither of them.

if (x == 0)
    printf("foo\n");
if (x != 0)
    printf("bar\n");
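
One way to rule this transformation out, sketched under the assumption that x can be declared as std::atomic<int>: the single atomic load into r may not be replaced by two separate reads of x that observe different values. The function name classify and the use of a returned string instead of printf are illustrative choices, not part of the original example.

```cpp
#include <atomic>
#include <string>

std::atomic<int> x{0};  // shared between threads

// Returns what the original snippet would print.
std::string classify() {
    int r = x.load();  // exactly one atomic read of x
    std::string out;
    if (r == 0)
        out += "foo\n";
    if (r != 0)
        out += "bar\n";
    return out;  // always exactly one of the two strings
}
```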

Question 2

On Linux, you can fork two or more child processes from a parent process in C++, let them all access the same memory location, and use synchronization to get the behavior you want. See: How to share memory between process fork()? , http://en.wikipedia.org/wiki/Dekker's_algorithm , http://en.wikipedia.org/wiki/Readers%E2%80%93writers_problem
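
A rough sketch of the idea, assuming Linux and an anonymous shared mapping created with mmap before the fork. The function name share_int_with_child is hypothetical, and the only synchronization used here is waitpid: the parent reads the value after the child has exited. Processes that run concurrently would need something stronger, such as a process-shared semaphore or mutex.

```cpp
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns the value the parent reads after the child wrote it, or -1 on error.
int share_int_with_child(int value_written_by_child) {
    // MAP_SHARED | MAP_ANONYMOUS: the mapping stays shared across fork().
    void* mem = mmap(nullptr, sizeof(int), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    int* shared = static_cast<int*>(mem);
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mem, sizeof(int));
        return -1;
    }
    if (pid == 0) {
        *shared = value_written_by_child;  // child writes to the shared page
        _exit(0);
    }

    int status = 0;
    waitpid(pid, &status, 0);  // wait until the child has finished
    int result = *shared;      // parent sees the child's write
    munmap(mem, sizeof(int));
    return result;
}
```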

Question 3

One example that will always result in a race condition: ask two threads to write different values to the same variable. Let's assume that

thread one sets variable a to 1
thread two sets variable a to 2

You will get a race condition even with a mutex (the mutex removes the data race, but not the dependence on ordering), because

if thread one is executed first then you get a=1 then a=2.
if thread two is executed first then you get a=2 then a=1.

The order of the threads depends on the OS, and there is no guarantee about which thread will run first. Otherwise the execution would be sequential and there would be no need for separate threads.
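
The mutex case above can be sketched as follows. The function name last_writer_wins is made up for this example. The program is well defined because every write to a is protected, but the result still depends on which thread the scheduler runs last.

```cpp
#include <mutex>
#include <thread>

// Returns the final value of a, which is either 1 or 2.
int last_writer_wins() {
    int a = 0;
    std::mutex m;
    std::thread t1([&] { std::lock_guard<std::mutex> g(m); a = 1; });
    std::thread t2([&] { std::lock_guard<std::mutex> g(m); a = 2; });
    t1.join();
    t2.join();
    return a;  // no data race, but the outcome is not deterministic
}
```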

Assume now that you have no synchronization at all and you are doing a=a+1 in the first thread and a=a+2 in the second thread. The initial value of a is 0.

In assembly, the generated code typically copies the value of a into a register, adds 1 to it (2 in the case of the second thread), and then stores the register back into a.

If you have no synchronization at all you can have the following order for example

Thread1: value of a copied to reg1. reg1 contains 0
Thread2: value of a copied to reg2. reg2 contains 0
Thread1: value of reg1 added 1. Now contains 1
Thread2: value of reg2 added 2. Now contains 2
Thread1: value of reg1 put to a. Now a contains 1
Thread2: value of reg2 put to a. Now a contains 2

If thread 1 had run to completion and thread 2 had run after it, you would have a=3 at the end. With the interleaving above, the update from thread 1 is lost and a ends up as 2.
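
For comparison, a small sketch of the same two additions with the read-modify-write made atomic, assuming a can be a std::atomic<int>. The function name add_concurrently is illustrative. Each fetch_add is indivisible, so neither update can be lost, whatever order the threads run in.

```cpp
#include <atomic>
#include <thread>

// Returns the final value of a after both threads have added to it.
int add_concurrently() {
    std::atomic<int> a{0};
    std::thread t1([&a] { a.fetch_add(1); });  // a = a + 1, atomically
    std::thread t2([&a] { a.fetch_add(2); });  // a = a + 2, atomically
    t1.join();
    t2.join();
    return a.load();  // always 3
}
```

Protecting both additions with the same std::mutex would give the same result.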

Now imagine a is a pointer, i.e. an address. As you know, using a wrong pointer address can cause the program to crash. So incorrect synchronization can cause the program to crash.

Makes sense?