Question

In most C or C++ environments, there is a "debug" mode and a "release" mode compilation.
Looking at the difference between the two, you find that debug mode adds debug symbols (often via the -g option on many compilers) and also disables most optimizations.
In "release" mode, you usually have all sorts of optimizations turned on.
Why the difference?


Solution

Without any optimization on, the flow through your code is linear. If you are on line 5 and single step, you step to line 6. With optimization on, you can get instruction re-ordering, loop unrolling and all sorts of optimizations.
For example:


void foo() {
1:  int i;
2:  for(i = 0; i < 2; )
3:    i++;
4:  return;
}
In this example, without optimization, you could single-step through the code and hit lines 1, 2, 3, 2, 3, 2, 4.

With optimization on, you might get an execution path that looks like: 2, 3, 3, 4 or even just 4! (The function does nothing after all...)

Bottom line: debugging code with optimization enabled can be a royal pain, especially if you have large functions.

Note that turning on optimization changes the code! In certain environments (safety-critical systems), this is unacceptable and the code being debugged has to be the code that ships. In that case you have to debug with optimization on.

While the optimized and non-optimized code should be "functionally" equivalent, under certain circumstances, the behavior will change.
Here is a simplistic example:

    int* ptr = (int*)0xdeadbeef;  // some address of a memory-mapped I/O device
    *ptr = 0;   // setup hardware device
    while(*ptr == 1) {    // loop until hardware device is done
       // do something
    }

With optimization off, this is straightforward, and you kinda know what to expect. However, if you turn optimization on, a couple of things might happen:

  • The compiler might optimize the while loop away entirely (we just wrote 0, and nothing in the visible code can ever make it 1)
  • Instead of re-reading memory on each iteration, the value might be kept in a register, so the I/O device is never actually polled
  • The memory access might be cached by the hardware (not necessarily a compiler optimization)

In all these cases, the behavior would be drastically different and most likely wrong.
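The usual fix for the compiler-related problems above is to declare the pointer volatile, which forbids the compiler from caching or eliminating the accesses (it does not address hardware-level caching or add any memory-ordering guarantees). A minimal sketch, with 0xdeadbeef kept as a placeholder address and the function name invented:

    void wait_for_device(void) {
        /* Hypothetical device address; 0xdeadbeef is just a placeholder. */
        volatile int* ptr = (volatile int*)0xdeadbeef;

        *ptr = 0;              /* set up the hardware device                    */
        while (*ptr == 1) {    /* volatile forces a real read on each iteration */
            /* do something */
        }
    }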

OTHER TIPS

Another crucial difference between debug and release is how local variables are stored. Conceptually, local variables are allocated storage in a function's stack frame. The symbol file generated by the compiler tells the debugger the offset of the variable in the stack frame, so the debugger can show it to you. The debugger peeks at that memory location to do this.

However, this means that every time a local variable is changed, the generated code for that source line has to write the value back to the correct location on the stack. This is inefficient because of the extra memory traffic it generates.

In a release build the compiler may assign a local variable to a register for a portion of a function. In some cases it may not assign stack storage for it at all (the more registers a machine has the easier this is to do).

However, the debugger doesn't know how registers map to local variables for a particular point in the code (I'm not aware of any symbol format that includes this information), so it can't show it to you accurately as it doesn't know where to go looking for it.
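As a hedged illustration (function and variable names invented): locals like the loop counter below will very likely live only in registers in an optimized build and may never get a stack slot, which is when debuggers typically report them as optimized out rather than showing a value:

    int count_positive(const int* v, int n) {
        /* 'count' and 'i' will very likely live only in registers in an
           optimized build; the debugger may then report them as optimized
           out instead of showing a value. */
        int count = 0;
        for (int i = 0; i < n; ++i) {
            if (v[i] > 0)
                ++count;
        }
        return count;
    }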

Another optimization would be function inlining. In optimized builds the compiler may replace a call to foo() with the actual code for foo everywhere it is used because the function is small enough. However, when you try to set a breakpoint on foo() the debugger wants to know the address of the instructions for foo(), and there is no longer a simple answer to this -- there may be thousands of copies of the foo() code bytes spread over your program. A debug build will guarantee that there is somewhere for you to put the breakpoint.
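A short sketch of the inlining point, with invented names: a small helper like this is a prime candidate for inlining, so in an optimized build a breakpoint set on it may never be hit, or may be hit at only some of the call sites:

    /* A small helper that an optimizing compiler is very likely to inline. */
    static int square(int x) {
        return x * x;
    }

    int sum_of_squares(int n) {
        int total = 0;
        for (int i = 0; i < n; ++i) {
            /* In a release build the call can vanish: the body is pasted in
               here, so a breakpoint set on square() may never be hit. */
            total += square(i);
        }
        return total;
    }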

Optimizing code is an automated process that improves the runtime performance of the code while preserving semantics. This process can remove intermediate results which are unnecessary to complete an expression or function evaluation, but which may be of interest to you when debugging. Similarly, optimizations can alter the apparent control flow so that things may happen in a slightly different order than what appears in the source code. This is done to skip unnecessary or redundant calculations. This rejiggering of code can mess with the mapping between source code line numbers and object code addresses, making it hard for a debugger to follow the flow of control as you wrote it.
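For example (an invented sketch), an intermediate variable that exists only to feed the next expression may be folded away entirely, so asking the debugger for its value gets you nothing:

    double fahrenheit_to_celsius(double f) {
        /* 'offset' exists only to feed the next expression; an optimizer may
           fold it away, leaving nothing for the debugger to display when you
           ask for it. */
        double offset = f - 32.0;
        return offset * 5.0 / 9.0;
    }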

Debugging in unoptimized mode allows you to see everything you've written as you've written it without the optimizer removing or reordering things.

Once you are happy that your program is working correctly you can turn on optimizations to get improved performance. Even though optimizers are pretty trustworthy these days, it's still a good idea to build a good quality test suite to ensure that your program runs identically (from a functional point of view, not considering performance) in both optimized and unoptimized mode.

The expectation is for the debug version to be, well, debugged! Setting breakpoints, single-stepping while watching variables, stack traces, and everything else you do in a debugger (IDE or otherwise) make sense if every line of non-empty, non-comment source code matches some machine code instruction.

Most optimizations mess with the order of the machine code. Loop unrolling is a good example, and common subexpressions can be lifted out of loops. With optimization turned on, even at the simplest level, you may be trying to set a breakpoint on a line that, at the machine code level, doesn't exist. Sometimes you can't monitor a local variable because it is kept in a CPU register, or it may even be optimized out of existence!
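A hedged sketch of the hoisting case, with invented names: the source computes a * b on every iteration, but an optimizing compiler will normally compute it once before the loop, so this source line no longer maps one-to-one onto the machine code generated for it:

    void scale_all(int* data, int n, int a, int b) {
        for (int i = 0; i < n; ++i) {
            /* The source computes a * b on every iteration, but an optimizer
               will normally hoist it out of the loop, so this line no longer
               corresponds one-to-one to the machine code it produced. */
            data[i] *= a * b;
        }
    }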

If you're debugging at the instruction level rather than the source level, it's an awful lot easier to map unoptimized instructions back to the source. Also, compilers are occasionally buggy in their optimizers.

In the Windows division at Microsoft, all release binaries are built with debugging symbols and full optimizations. The symbols are stored in separate PDB files and do not affect the performance of the code. They don't ship with the product, but most of them are available at the Microsoft Symbol Server.

Another issue with optimization is inline functions, also in the sense that you will always single-step through them (once a call is inlined, there is no call left to step over).

With GCC, with debugging and optimizations enabled together, if you don't know what to expect you will think the code is misbehaving and re-executing the same statement multiple times (it happened to a couple of my colleagues). Also, the debugging info GCC emits with optimizations on tends to be of poorer quality than it could be.

However, in languages hosted by a virtual machine like Java, optimizations and debugging can coexist: even during debugging, JIT compilation to native code continues, and only the code of the methods being debugged is transparently converted to an unoptimized version.

I would like to emphasize that optimization should not change the behaviour of the code, unless the optimizer is buggy, or the code itself is buggy and relies on partially undefined semantics; the latter is more common in multithreaded programming or when inline assembly is also used.
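A hedged sketch of what relying on such semantics can look like in multithreaded code (names invented, thread creation omitted): a plain flag written by one thread and polled by another is a data race, so the optimizer is free to read it once and spin forever, while the unoptimized build may appear to work. Making the flag an atomic (or protecting it with a lock) is the proper fix:

    /* 'done' is a plain int shared between threads: a data race. It should be
       an atomic (or protected by a lock) for the program to be well defined. */
    int done = 0;

    /* Runs in one thread. The optimizer may read 'done' once and spin forever,
       since nothing visible in this function can change it. */
    void wait_for_worker(void) {
        while (!done) {
            /* spin */
        }
    }

    /* Called from another thread. */
    void worker_finished(void) {
        done = 1;
    }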

Code with debugging symbols is larger, which may mean more cache misses, i.e. slower execution, which may be an issue for server software.

At least on Linux (and there's no reason why Windows should be different), debug info is packaged in a separate section of the binary and is not loaded during normal execution. It can be split into a different file to be used for debugging. Also, on some compilers (including GCC, and I guess also Microsoft's C compiler) debugging info and optimizations can both be enabled together. If not, obviously the code is going to be slower.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow