Question

I have read about Boost's and std's (C++11) atomic types and operations over and over again, and I'm still not sure I understand them correctly (in some cases I don't understand them at all). So, I have a few questions.

The sources I use for learning:


Consider the following snippet:

atomic<bool> x,y;

void write_x_then_y()
{
    x.store(true, memory_order_relaxed);
    y.store(true, memory_order_release);
}

#1: Is it equivalent to this next one?

atomic<bool> x,y;

void write_x_then_y()
{
    x.store(true, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);    // *1
    y.store(true, memory_order_relaxed);          // *2
}

#2: Is the following statement true?

Line *1 ensures that when operations below it (for example *2) become visible to another thread (using acquire), the code above *1 will be visible too (with the new values).


The next snippet extends the ones above:

void read_y_then_x()
{
    if(y.load(memory_order_acquire))
    {
        assert(x.load(memory_order_relaxed));
    }
}

#3: Is it equivalent to this next one?

void read_y_then_x()
{
    atomic_thread_fence(memory_order_acquire);    // *3
    if(y.load(memory_order_relaxed))              // *4
    {
        assert(x.load(memory_order_relaxed));     // *5
    }
}

#4: Are the following statements true?

  • Line *3 assures that if some operation under a release order (in another thread, like *2) is visible, every operation above the release order (for example *1) will be visible as well.
  • That means the assert at *5 will never fail (with false as the default values).
  • But this does not assure that even if, physically (in the processor), *2 happens before *3, it will be visible to the snippet above (running in a different thread); read_y_then_x() can still read old values. The only thing assured is that if y is true, x will also be true.

#5: Incrementing (adding 1 to) an atomic integer can be done with memory_order_relaxed and no data is lost. The only issues are the order and the timing of the result's visibility.


According to Boost, the following snippet is a working reference counter:

#include <boost/intrusive_ptr.hpp>
#include <boost/atomic.hpp>

class X {
public:
  typedef boost::intrusive_ptr<X> pointer;
  X() : refcount_(0) {}

private:
  mutable boost::atomic<int> refcount_;
  friend void intrusive_ptr_add_ref(const X * x)
  {
    x->refcount_.fetch_add(1, boost::memory_order_relaxed);
  }
  friend void intrusive_ptr_release(const X * x)
  {
    if (x->refcount_.fetch_sub(1, boost::memory_order_release) == 1) {
      boost::atomic_thread_fence(boost::memory_order_acquire);
      delete x;
    }
  }
};

#6: Why is memory_order_release used for decrementing? How does it work (in this context)? If what I wrote earlier is true, what makes the returned value the most recent, especially when we use acquire AFTER reading and not before/during?

#7: Why is there an acquire fence after the reference counter reaches zero? We just read that the counter is zero, and no other atomic variable is used (the pointer itself is not marked/used as such).


Solution

1: No. A release fence synchronizes with all acquire operations and fences. If there was a third atomic<bool> z which was being manipulated in a third thread, the fence would synchronize with that third thread as well, which is unnecessary. That being said, they will act the same on x86, but that is because x86 has very strong synchronization. The architectures used on 1000 core systems tend to be weaker.

2: Yes, this is correct. A fence ensures that if you see anything that follows, you also see everything that preceded.

3: In general they are different, but realistically they will be the same. The compiler is allowed to reorder two relaxed operations on different variables, but may not introduce spurious operations. If the compiler has any way of being confident that it is going to need to read x, it may do so before reading y. In your particular case, this is very difficult for the compiler, but there are many similar cases where such reordering is fair game.

4: All of those are true. The atomic operations guarantee consistency. They do not always guarantee that things happen in an order you wanted, they just prevent pathological orders that ruin your algorithm.

5: Correct. Relaxed operations are truly atomic. They just don't synchronize any additional memory.

6: For any given atomic object M, C++ guarantees that there is an "official" modification order for operations on M. You don't get to see the "latest" value for M so much as C++ and the processor guarantee that all threads will see a consistent series of values for M. If two threads increment the refcount, then decrement it, there is no guarantee which one will decrement it to 0, but there is a guarantee that one of them will see that it decremented it to 0. There is no way for both of them to see that they decremented 2->1 and 2->1, but the refcount somehow combined them to 0. One thread will always see 2->1 and the other will see 1->0.

Remember, memory order is more about synchronizing the memory around the atomic. The atomic gets handled properly no matter what memory order you use.

7: This one is trickier. The short version is that the decrement uses release order because some thread is going to have to run the destructor for x, and we want to make sure it sees all operations on x made on all threads. Using release order on the decrement satisfies this need because you can prove that it works. Whoever is responsible for deleting x acquires all changes before doing so (using a fence to make sure atomics in the deleter don't drift upward). In all cases where threads release their own references, it is obvious that all threads will have a release-order decrement before the deleter gets called. In cases where one thread increments the refcount and another decrements it, you can prove that the only valid way to do so is if the threads synchronize with each other, so that the destructor sees the result of both threads. Failure to synchronize would create a race condition no matter what, so the user is obliged to get it right.

Other tips

After pondering #1, I have been convinced they are not equivalent by this argument from §29.8.3 [atomics.fences]:

A release fence A synchronizes with an atomic operation B that performs an acquire operation on an atomic object M if there exists an atomic operation X such that A is sequenced before X, X modifies M, and B reads the value written by X or a value written by any side effect in the hypothetical release sequence X would head if it were a release operation.

This paragraph says that a release fence can synchronize only with an acquire operation. But a release operation can additionally synchronize with a consume operation.

Your read_y_then_x() with the acquire fence has the fence in the wrong place. It should be placed between the two atomic loads. An acquire fence essentially makes all the loads above the fence act somewhat like acquire loads, except that the happens-before relation isn't established until the fence executes.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow