Question

I have read about Boost's and std's (C++11) atomic types and operations over and over again, and I'm still not sure I understand them correctly (in some cases I don't understand them at all). So, I have a few questions.

The sources I use for learning:


Consider the following snippet:

atomic<bool> x,y;

void write_x_then_y()
{
    x.store(true, memory_order_relaxed);
    y.store(true, memory_order_release);
}

#1: Is it equivalent to this next one?

atomic<bool> x,y;

void write_x_then_y()
{
    x.store(true, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);    // *1
    y.store(true, memory_order_relaxed);          // *2
}

#2: Is the following statement true?

Line *1 ensures that when operations below it (for example *2) become visible to another thread (using acquire), the code above *1 will be visible too (with the new values).


The next snippet extends the ones above:

void read_y_then_x()
{
    if(y.load(memory_order_acquire))
    {
        assert(x.load(memory_order_relaxed));
    }
}

#3: Is it equivalent to this next one?

void read_y_then_x()
{
    atomic_thread_fence(memory_order_acquire);    // *3
    if(y.load(memory_order_relaxed))              // *4
    {
        assert(x.load(memory_order_relaxed));     // *5
    }
}

#4: Are the following statements true?

  • Line *3 assures that if some operation under a release order (in another thread, like *2) is visible, every operation above the release order (for example *1) will be visible as well.
  • That means the assert at *5 will never fail (with false as the default values).
  • But this does not assure that even if, physically (in the processor), *2 happens before *3, it will be visible to the snippet above (running in a different thread); read_y_then_x() can still read old values. The only thing assured is that if y is true, x will also be true.

#5: Incrementing (adding 1 to) an atomic integer can be done with memory_order_relaxed and no data is lost. The only issues are the order and the timing of the result's visibility.


According to Boost, the following snippet is a working reference counter:

#include <boost/intrusive_ptr.hpp>
#include <boost/atomic.hpp>

class X {
public:
  typedef boost::intrusive_ptr<X> pointer;
  X() : refcount_(0) {}

private:
  mutable boost::atomic<int> refcount_;
  friend void intrusive_ptr_add_ref(const X * x)
  {
    x->refcount_.fetch_add(1, boost::memory_order_relaxed);
  }
  friend void intrusive_ptr_release(const X * x)
  {
    if (x->refcount_.fetch_sub(1, boost::memory_order_release) == 1) {
      boost::atomic_thread_fence(boost::memory_order_acquire);
      delete x;
    }
  }
};

#6: Why is memory_order_release used for decrementing? How does it work (in this context)? If what I wrote earlier is true, what makes the returned value the most recent, especially when we use acquire AFTER reading and not before/during?

#7: Why is there an acquire fence after the reference counter reaches zero? We just read that the counter is zero, and no other atomic variable is used (the pointer itself is not marked/used as such).


Solution

1: No. A release fence synchronizes with all acquire operations and fences. If there was a third atomic<bool> z which was being manipulated in a third thread, the fence would synchronize with that third thread as well, which is unnecessary. That being said, they will act the same on x86, but that is because x86 has very strong synchronization. The architectures used on 1000 core systems tend to be weaker.

2: Yes, this is correct. A fence ensures that if you see anything that follows, you also see everything that preceded.

3: In general they are different, but realistically they will be the same. The compiler is allowed to reorder two relaxed operations on different variables, but may not introduce spurious operations. If the compiler has any way of being confident that it is going to need to read x, it may do so before reading y. In your particular case, this is very difficult for the compiler, but there are many similar cases where such reordering is fair game.

4: All of those are true. The atomic operations guarantee consistency. They do not always guarantee that things happen in an order you wanted, they just prevent pathological orders that ruin your algorithm.

5: Correct. Relaxed operations are truly atomic. They just don't synchronize any additional memory.

6: For any given atomic object M, C++ guarantees that there is an "official" modification order for operations on M. You don't get to see the "latest" value for M so much as C++ and the processor guarantee that all threads will see a consistent series of values for M. If two threads increment the refcount, then decrement it, there is no guarantee which one will decrement it to 0, but there is a guarantee that one of them will see that it decremented it to 0. There is no way for both of them to see that they decremented 2->1 and 2->1, but the refcount somehow combined them to 0. One thread will always see 2->1 and the other will see 1->0.

Remember, memory order is more about synchronizing the memory around the atomic. The atomic gets handled properly no matter what memory order you use.

7: This one is trickier. The short version is that the decrement uses release order because some thread is going to have to run the destructor for x, and we want to make sure it sees all operations on x made on all threads. Using release order on the decrement satisfies this need because you can prove that it works. Whoever is responsible for deleting x acquires all changes before doing so (using a fence to make sure atomics in the deleter don't drift upward). In all cases where threads release their own references, it is obvious that all threads will have a release-order decrement before the deleter gets called. In cases where one thread increments the refcount and another decrements it, you can prove that the only valid way to do so is if the threads synchronize with each other, so that the destructor sees the result of both threads. Failure to synchronize would create a race condition no matter what, so the user is obliged to get it right.

Other tips

After pondering #1, I have been convinced they are not equivalent by this argument from §29.8.3 [atomics.fences]:

A release fence A synchronizes with an atomic operation B that performs an acquire operation on an atomic object M if there exists an atomic operation X such that A is sequenced before X, X modifies M, and B reads the value written by X or a value written by any side effect in the hypothetical release sequence X would head if it were a release operation.

This paragraph says that a release fence can synchronize only with an acquire operation. But a release operation can additionally synchronize with a consume operation.

Your read_y_then_x() with the acquire fence has the fence in the wrong place. It should be placed between the two atomic loads. An acquire fence essentially makes all the loads above the fence act somewhat like acquire loads, except that the happens-before relation isn't established until the fence executes.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow