Question

A compiler cannot eliminate or reorder reads/writes of a volatile-qualified variable.

But what about the cases where other variables are present, which may or may not be volatile-qualified?

Scenario 1

volatile int a;
volatile int b;

a = 1;
b = 2;
a = 3;
b = 4;

Can the compiler reorder the first and second, or the third and fourth assignments?

Scenario 2

volatile int a;
int b, c;

b = 1;
a = 1;
c = b;
a = 3;

Same question: can the compiler reorder the first and second, or the third and fourth assignments?


Solution

The C++ standard says (1.9/6):

The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions.

In scenario 1, either of the changes you propose would change the sequence of writes to volatile data, so neither is permitted.

In scenario 2, neither change you propose changes the sequence. So they're allowed under the "as-if" rule (1.9/1):

... conforming implementations are required to emulate (only) the observable behavior of the abstract machine ...
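
To see the as-if rule in action, here is a minimal sketch (the function names are illustrative, not from the question) of one rewrite of scenario 2 that a compiler would be permitted to make:

volatile int a;

void scenario2_as_written() {
    int b, c;
    b = 1;
    a = 1;
    c = b;
    a = 3;
}

// A permitted "as-if" rewrite: b and c are local and their values are never
// observable, so their assignments may be dropped entirely; only the two
// volatile writes to a, in that order, must survive.
void scenario2_as_if() {
    a = 1;
    a = 3;
}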

In order to tell that this has happened, you would need to examine the machine code, use a debugger, or provoke undefined or unspecified behavior whose result you happen to know on your implementation. For example, an implementation might make guarantees about the view that concurrently-executing threads have of the same memory, but that's outside the scope of the C++ standard. So while the standard might permit a particular code transformation, a particular implementation could rule it out, on grounds that it doesn't know whether or not your code is going to run in a multi-threaded program.

If you were to use observable behavior to test whether the re-ordering has happened or not (for example, printing the values of variables in the above code), then of course it would not be allowed by the standard.

OTHER TIPS

For scenario 1, the compiler should not perform either of the reorderings you mention. For scenario 2, the answer might depend on:

  • whether the b and c variables are visible outside the current function, either by being non-local or by having had their address passed to another function (see the sketch after this list)
  • who you talk to (apparently there is some disagreement about how strict volatile is in C/C++)
  • your compiler implementation
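
As a hedged illustration of the visibility point, here is a variant of scenario 2 in which b's address escapes to code the compiler cannot see (external_observer is a hypothetical function in another translation unit, not something from the original question):

extern void external_observer(int *p);  // hypothetical, defined elsewhere

volatile int a;

void scenario2_with_escape() {
    int b = 0, c = 0;
    external_observer(&b);  // b's address escapes, so the compiler must assume
                            // later stores to b can be observed elsewhere
    b = 1;
    a = 1;
    c = b;
    a = 3;
    external_observer(&c);  // likewise forces c to hold its final value here
}

Even then, this only keeps the stores to b and c from being eliminated; the standard still says nothing about their ordering relative to the volatile writes to a.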

So (softening my first answer), I'd say that if you're depending on certain behavior in scenario 2, you'd have to treat it as non-portable code whose behavior on a particular platform would have to be determined by whatever the implementation's documentation indicates (and if the docs say nothing about it, then you're out of luck as far as guaranteed behavior goes).

From C99 5.1.2.3/2, "Program execution":

Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place.

...

(paragraph 5) The least requirements on a conforming implementation are:

  • At sequence points, volatile objects are stable in the sense that previous accesses are complete and subsequent accesses have not yet occurred.

Here's a little of what Herb Sutter has to say about the required behavior of volatile accesses in C/C++ (from "volatile vs. volatile", http://www.ddj.com/hpc-high-performance-computing/212701484):

what about nearby ordinary reads and writes -- can those still be reordered around unoptimizable reads and writes? Today, there is no practical portable answer because C/C++ compiler implementations vary widely and aren't likely to converge anytime soon. For example, one interpretation of the C++ Standard holds that ordinary reads can move freely in either direction across a C/C++ volatile read or write, but that an ordinary write cannot move at all across a C/C++ volatile read or write -- which would make C/C++ volatile both less restrictive and more restrictive, respectively, than an ordered atomic. Some compiler vendors support that interpretation; others don't optimize across volatile reads or writes at all; and still others have their own preferred semantics.
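
Here is a small, non-authoritative sketch of the situation Sutter describes (the names flag and data are just illustrative):

volatile int flag;
int data;

void publish() {
    flag = 1;   // volatile write
    data = 10;  // ordinary write: under one reading of the Standard it may not
                // move across the volatile writes; under another reading, and in
                // some real compilers, it may move freely in either direction
    flag = 2;   // volatile write
}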

And for what it's worth, Microsoft documents the following for the C/C++ volatile keyword (as Microsoft-specific):

  • A write to a volatile object (volatile write) has Release semantics; a reference to a global or static object that occurs before a write to a volatile object in the instruction sequence will occur before that volatile write in the compiled binary.

  • A read of a volatile object (volatile read) has Acquire semantics; a reference to a global or static object that occurs after a read of volatile memory in the instruction sequence will occur after that volatile read in the compiled binary.

This allows volatile objects to be used for memory locks and releases in multithreaded applications.
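
As a sketch only (it relies entirely on the Microsoft-specific semantics quoted above and on hardware that honors them; it is not portable C++, and on other compilers you would reach for std::atomic instead), this is the kind of publish/consume pattern that documentation has in mind:

int payload;         // ordinary global data being published
volatile int ready;  // flag relying on MSVC's documented release/acquire behavior

void producer() {
    payload = 42;  // ordinary write to a global...
    ready = 1;     // ...which may not sink below this volatile write (release)
}

void consumer() {
    while (ready == 0) { }  // volatile read (acquire)
    int x = payload;        // ordinary read may not hoist above the read of
                            // ready, so x is intended to observe 42
    (void)x;
}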

Volatile is not a memory fence. The assignments to b and c in snippet #2 can be eliminated or performed at any point. Why would you want the declarations in #2 to cause the behavior of #1?

Some compilers regard accesses to volatile-qualified objects as a memory fence. Others do not. Some programs are written to require that volatile works as a fence. Others aren't.

Code that is written to require fences, running on platforms that provide them, may run better than code that is written not to require fences, running on platforms that don't provide them; but code that requires fences will malfunction if they are not provided. Conversely, code that doesn't require fences will often run slower on platforms that do provide them than code that does require them, and implementations that provide fences will run such code more slowly than implementations that don't.

A good approach may be to define a macro, semi_volatile, that expands to nothing on systems where volatile implies a memory fence, and to volatile on systems where it doesn't. If variables that need their accesses ordered with respect to other volatile variables, but not with respect to each other, are qualified as semi_volatile, and that macro is defined correctly, reliable operation will be achieved on systems with or without implicit memory fences, along with the most efficient operation that can be achieved on systems with fences. If a compiler actually implemented a qualifier, say semivolatile, that worked as required, the macro could be defined to use that qualifier and yield even better code.
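
A minimal sketch of that macro idea, assuming VOLATILE_IMPLIES_FENCE is a configuration switch you define yourself after consulting your compiler's documentation (it is not a standard or predefined macro):

#ifdef VOLATILE_IMPLIES_FENCE
/* volatile accesses already act as fences, so nearby ordinary accesses are
   ordered anyway and the qualifier can expand to nothing */
#define semi_volatile
#else
/* no implicit fence: fall back to volatile so the compiler keeps these
   accesses ordered relative to other volatile accesses */
#define semi_volatile volatile
#endif

/* accesses to these must stay ordered with respect to genuinely volatile
   accesses, but need not be ordered with respect to each other */
semi_volatile int status_word;
semi_volatile int error_count;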

IMHO, that's an area the Standard really should address, since the concepts involved are applicable on many platforms, and any platform where fences aren't meaningful can simply ignore them.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow