This is actually a bug which was fixed in the 4.7 branch:
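I think you need a full memory barrier here: either the GCC builtin __sync_synchronize(),
or, on x86 only, something like __asm__ __volatile__ ( "mfence" ::: "memory" ). The "memory" clobber also stops the compiler from reordering memory accesses across that point.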
Some people like to be very rigorous about which synchronization operation they need (a full barrier is often stronger, and slower, than strictly required), but I think using mfence
all the time will suffice for common cases.
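
For what it's worth, here is a rough sketch of how I would wrap the two barriers and use one between writing some data and setting a flag. The names full_barrier, full_barrier_x86, publish, try_consume, payload and ready are all made up for illustration and are not from your code. It assumes GCC or Clang, and the mfence variant is only compiled on x86-64.

```c
/* Portable full barrier: GCC/Clang builtin, works on any target. */
static inline void full_barrier(void)
{
    __sync_synchronize();
}

#if defined(__x86_64__)
/* x86-64 only: explicit mfence. The "memory" clobber also keeps the
 * compiler from moving loads/stores across this point. */
static inline void full_barrier_x86(void)
{
    __asm__ __volatile__("mfence" ::: "memory");
}
#endif

/* Hypothetical shared state, for illustration only. */
static int payload;
static volatile int ready;

/* Writer: store the data, then the barrier, then raise the flag. */
static void publish(int value)
{
    payload = value;
    full_barrier();   /* payload is visible before ready is set */
    ready = 1;
}

/* Reader: check the flag, then the barrier, then read the data.
 * Returns 1 and fills *out if data was published, else 0. */
static int try_consume(int *out)
{
    if (!ready)
        return 0;
    full_barrier();   /* do not read payload ahead of the flag check */
    *out = payload;
    return 1;
}
```

One thread would call publish(42) and another would poll try_consume(&x) until it returns 1. This is the "full barrier everywhere" approach; if you have C11 atomics available, acquire/release operations would be the more precise tool.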