asio implicit strand and data synchronization

Question

Before the operation invokes the user handler, Boost.Asio uses a memory fence to provide the appropriate memory ordering without forcing mutual exclusion of handler execution. Thus, thread B will observe changes to memory that occurred within the context of thread A.
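To illustrate the kind of guarantee a fence gives, here is a minimal sketch that uses only standard C++11 facilities rather than Boost.Asio's internal fences. The function name publish_and_observe and the variable names are hypothetical; the point is only that a release fence in thread A paired with an acquire fence in thread B makes A's plain writes visible to B without a mutex.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: thread A writes plain data and publishes a flag,
// thread B waits for the flag and then reads the data.
int publish_and_observe()
{
  int payload = 0;                 // plain, non-atomic data
  std::atomic<bool> ready(false);  // hand-off flag
  int observed = -1;

  std::thread a([&] {
    payload = 42;                                         // write in "thread A"
    std::atomic_thread_fence(std::memory_order_release);  // release fence
    ready.store(true, std::memory_order_relaxed);
  });

  std::thread b([&] {
    while (!ready.load(std::memory_order_relaxed)) {
      std::this_thread::yield();
    }
    std::atomic_thread_fence(std::memory_order_acquire);  // acquire fence
    observed = payload;  // the write from thread A is visible here
  });

  a.join();
  b.join();
  assert(ready.load(std::memory_order_relaxed));  // both threads have finished
  return observed;
}
```

Neither thread ever blocks the other on a lock; the fences only constrain ordering and visibility.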

C++03 did not specify requirements for memory visibility with regard to multi-threaded execution. However, C++11 defines these requirements in § 1.10 Multi-threaded executions and data races, as well as in the Atomic operations and Thread support library sections. Boost and C++11 mutexes do perform the appropriate memory synchronization. For other implementations, it is worth checking the mutex library's documentation to verify that memory synchronization occurs.
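For comparison, here is a small sketch of the mutex case with std::mutex. The helper name count_with_mutex is hypothetical. Each unlock synchronizes with the next lock of the same mutex, so every thread sees the other's increments.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical helper: two threads increment a plain int under a mutex.
int count_with_mutex(int per_thread)
{
  assert(per_thread >= 0);  // precondition of this sketch

  int counter = 0;  // plain data, protected only by the mutex
  std::mutex m;

  auto work = [&] {
    for (int i = 0; i < per_thread; ++i) {
      std::lock_guard<std::mutex> lock(m);  // lock acquires, unlock releases
      ++counter;
    }
  };

  std::thread t1(work);
  std::thread t2(work);
  t1.join();
  t2.join();
  return counter;  // always 2 * per_thread
}
```

Unlike a bare fence, the mutex also provides mutual exclusion, which is what makes the unsynchronized ++counter safe here.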

Boost.Asio memory fences are an implementation detail, and thus always subject to change. Boost.Asio abstracts itself from the architecture/compiler specific implementations through a series of conditional defines within asio/detail/fenced_block.hpp where only a single memory barrier implementation is included. The underlying implementation is contained within a class, for which a fenced_block alias is created via a typedef.

Here is a relevant excerpt:

#elif defined(__GNUC__) && (defined(__hppa) || defined(__hppa__))
# include "asio/detail/gcc_hppa_fenced_block.hpp"
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
# include "asio/detail/gcc_x86_fenced_block.hpp"
#elif ...

...

namespace asio {
namespace detail {

...

#elif defined(__GNUC__) && (defined(__hppa) || defined(__hppa__))
typedef gcc_hppa_fenced_block fenced_block;
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
typedef gcc_x86_fenced_block fenced_block;
#elif ...

...

} // namespace detail
} // namespace asio
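The selection pattern in the excerpt can be sketched in isolation. Everything below is hypothetical: the sketch namespace, the two implementation classes, and the SKETCH_NO_THREADS macro are made up for illustration and are not the names Boost.Asio uses. It only shows how a conditional typedef lets callers refer to one alias while the build picks exactly one implementation.

```cpp
#include <atomic>
#include <cassert>

namespace sketch {

// Hypothetical implementation built on the C++11 fence.
class std_fence_impl
{
public:
  static bool is_noop() { return false; }
  static void barrier() { std::atomic_thread_fence(std::memory_order_seq_cst); }
};

// Hypothetical no-op implementation for builds without threads.
class null_fence_impl
{
public:
  static bool is_noop() { return true; }
  static void barrier() {}
};

// Exactly one implementation is bound to the alias at compile time.
#if defined(SKETCH_NO_THREADS)
typedef null_fence_impl fenced_block;
#else
typedef std_fence_impl fenced_block;
#endif

// Callers name only the alias, never a concrete class.
inline void demo()
{
  fenced_block::barrier();
  // Holds unless SKETCH_NO_THREADS is defined for the build.
  assert(!fenced_block::is_noop());
}

} // namespace sketch
```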

The implementations of the memory barriers are specific to the architecture and compiler. Boost.Asio has a family of asio/detail/*_fenced_block.hpp header files. For example, the win_fenced_block uses InterlockedExchange for Borland; otherwise it uses the xchg assembly instruction, which has an implicit lock prefix when used with a memory address. For gcc_x86_fenced_block, Boost.Asio uses inline assembly with a "memory" clobber, which also prevents the compiler from reordering memory accesses across the barrier.

If you find yourself needing to use a fence, then consider the Boost.Atomic library. Introduced in Boost 1.53, Boost.Atomic provides an implementation of thread and signal fences based on the C++11 standard. Boost.Asio has been using its own implementation of memory fences since before Boost.Atomic was added to Boost. Also, the Boost.Asio fences are scope-based: fenced_block will perform an acquire in its constructor and a release in its destructor.
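The scope-based behavior described above (acquire on construction, release on destruction) can be approximated with the C++11 fences. This is a sketch under assumptions, not Boost.Asio's fenced_block: the class scoped_fence and the function handoff_through_scoped_fences are hypothetical names, and std::atomic_thread_fence is used in place of Boost.Atomic to keep the example free of external dependencies.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical RAII fence: acquire in the constructor, release in the destructor.
class scoped_fence
{
public:
  scoped_fence() { std::atomic_thread_fence(std::memory_order_acquire); }
  ~scoped_fence() { std::atomic_thread_fence(std::memory_order_release); }
  scoped_fence(const scoped_fence&) = delete;
  scoped_fence& operator=(const scoped_fence&) = delete;
};

// Hypothetical usage: each "handler" body runs inside a scoped_fence.
int handoff_through_scoped_fences()
{
  int shared_value = 0;             // plain data touched by both handlers
  std::atomic<bool> posted(false);  // stands in for queueing the next handler
  int seen = -1;

  std::thread producer([&] {
    {
      scoped_fence fence;
      shared_value = 7;             // first handler modifies shared state
    }                               // destructor issues the release fence
    posted.store(true, std::memory_order_relaxed);
  });

  std::thread consumer([&] {
    while (!posted.load(std::memory_order_relaxed)) {
      std::this_thread::yield();
    }
    scoped_fence fence;             // constructor issues the acquire fence
    seen = shared_value;            // second handler observes the change
  });

  producer.join();
  consumer.join();
  assert(posted.load(std::memory_order_relaxed));  // hand-off completed
  return seen;
}
```

Tying the fences to a scope means the release cannot be forgotten on an early return or an exception.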