Question

As I understand from my reading, undefined-behavior is the result of leaving the compiler with several non-identical alternatives at compile time. However, wouldn't that mean that if one were to follow strict coding practice (like putting each assignment and each equality in a separate statement, proper debugging and commenting), finding the source of the undefined behavior shouldn't pose a significant problem?

Further, for each error that comes up, once you identify the offending code, you should know which statements can be used in that particular statement's stead, correct?

EDIT: I'm not interested in places where you have written code that you didn't mean to write. I'm interested in examples where code that is sound by mathematical logic fails to work.

Also, I consider 'good coding practice' to be strong informative comments every few lines, proper indentation, and debugging dumps on a regular basis.

Solution

Undefined behavior isn't necessarily leaving the compiler with multiple alternatives. Most commonly it is simply doing something that doesn't make sense.

For example, take this code:

int arr[2];
arr[200] = 42;

This is undefined behavior. It's not that the compiler was given multiple alternatives to choose from; it's just that what I'm doing does not make sense. Ideally, it should not be allowed in the first place, but without potentially expensive runtime checking, we can't guarantee that something like this won't occur in our code. So in C++, the rule is simply that the language specifies only the behavior of a program that sticks to the rules. If the program does something erroneous, as in the above example, what happens next is simply undefined.
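For contrast, here is a minimal sketch of what the "potentially expensive runtime checking" could look like if you opt into it yourself. It uses std::array::at, which throws std::out_of_range on a bad index; the helper name checked_store is made up for this illustration and is not part of the original answer.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper (not from the answer): stores `value` at `index`
// through the bounds-checked accessor and reports whether the write happened.
bool checked_store(std::array<int, 2>& arr, std::size_t index, int value) {
    try {
        arr.at(index) = value;   // at() throws std::out_of_range on a bad index
        return true;
    } catch (const std::out_of_range&) {
        return false;            // a well-defined failure instead of UB
    }
}
```

With a two-element array, checked_store(arr, 200, 42) returns false instead of writing out of bounds. The price is a comparison on every access, which is the kind of cost the language declines to impose by default.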

Now, imagine how you're going to detect this error. How is it going to surface? It might never seem to cause any problems. Perhaps we just so happen to write into memory that's mapped to the process (so we don't get an access violation), but is never otherwise used (so no other part of the program will read our garbage value, or overwrite what we wrote). Then it'll seem like the program is bug-free and works just fine.

Or it might hit an address that's not even mapped to our process. Then the program will crash immediately.

Or it might hit an address that's mapped to our process, but at some point later will be used for something. Then all we know is that sooner or later, the function reading from that address will get an unexpected value and behave strangely. That part is easy to spot in the debugger, but it doesn't tell us anything about when or from where that garbage value was written. So there's no simple way to trace the error back to its source.

OTHER TIPS

First, some definitions from the C++03 standard:

1.3.5 implementation-defined behavior

Behavior, for a well-formed program construct and correct data, that depends on the implementation and that each implementation shall document.

1.3.12 undefined behavior

Behavior, such as might arise upon use of an erroneous program construct or erroneous data, for which this International Standard imposes no requirements. Undefined behavior may also be expected when this International Standard omits the description of any explicit definition of behavior.

1.3.13 unspecified behavior

Behavior, for a well-formed program construct and correct data, that depends on the implementation. The implementation is not required to document which behavior occurs.

Although "unspecified behavior" could also be abbreviated as UB, I've never seen that done; UB always means undefined behavior. Throughout the standard are statements similar to "doing X is undefined behavior," but sometimes you run into a case that's simply not covered, and per the definition above, that is undefined too.
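To make the two non-UB categories concrete, here is a small sketch. The examples are chosen for illustration and do not come from the answer: the order in which function arguments are evaluated is unspecified, and the size of int is implementation-defined. All helper names (record, take_two, argument_order, int_size) are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers for illustration only.
static std::string g_log;

int record(char tag) {
    g_log += tag;   // remember the order in which calls actually happened
    return 0;
}

int take_two(int, int) { return 0; }

// Unspecified behavior: the implementation picks the order in which the two
// arguments are evaluated and does not have to document its choice.
std::string argument_order() {
    g_log.clear();
    take_two(record('a'), record('b'));
    return g_log;   // "ab" or "ba"; both are conforming
}

// Implementation-defined behavior: the value differs between
// implementations, but each one is required to document it.
std::size_t int_size() { return sizeof(int); }
```

Either result from argument_order() is a valid outcome of a correct program, which is what separates unspecified behavior from undefined behavior: the set of possible outcomes is bounded, and the program stays well-formed.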

To put the definition another way, if you have any undefined behavior anywhere, then all bets are off. As far as the standard is concerned, your program could do anything from inviting your mother-in-law over for Super Bowl weekend to running nethack. Due to UB's very nature, you can't test for it, and you can't expect any help from the compiler. (Though for some trivial, common errors compilers do generally produce diagnostics.)

Usually something is defined as UB because it just doesn't make sense logically (e.g. accessing an array out of bounds), but also often because preventing it would require the implementation to do too much work, often at runtime. Remember C++ is derived from C, and being able to produce highly optimized programs is a major goal of both languages. To this end, the languages defer to the programmer to make sure the code is correct in these situations, which is related to the "you don't pay for what you don't use" principle.
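As a sketch of what "deferring to the programmer" means in practice, take signed integer overflow, another well-known case of UB (an example picked here, not one the answer mentions). The language does not check for it, so if you want a guarantee you write the check yourself. The helper name checked_add is hypothetical.

```cpp
#include <limits>

// Hypothetical helper: adds a and b only if the result fits in an int.
// Returns false and leaves `out` untouched when the sum would overflow.
bool checked_add(int a, int b, int& out) {
    const int max = std::numeric_limits<int>::max();
    const int min = std::numeric_limits<int>::min();
    if (b > 0 && a > max - b) return false;  // sum would exceed the maximum
    if (b < 0 && a < min - b) return false;  // sum would drop below the minimum
    out = a + b;                             // safe: the checks above rule out overflow
    return true;
}
```

The comparisons happen before the addition, so the overflowing operation is never executed. Code that already knows its operands are small skips the check and pays nothing, which is the trade-off the paragraph above describes.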

So, finally, UB is bad, very bad; avoid it at all costs. However, the hard part of UB isn't knowing what it is or under what circumstances it occurs; the hard part is recognizing when you invoke UB. For example:

std::string s = "abc";
char& c = s[0];
std::cout.write(s.data(), s.length());
c = '-';

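Looks perfectly reasonable, right? Nope, this is UB, yet it will work as you expect on all the popular implementations. (Under the C++03 rules, calling data() is allowed to invalidate references into the string, so c may no longer be valid by the time it is assigned to.)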

I'm not sure if there's a formal definition of "undefined behavior", but following good coding standards can reduce ambiguity and lead to fewer compile-time and runtime defects.

However, getting two programmers to agree on what "good coding standards" are is a complicated and error-prone process.

To your second question: yes, compilers will generally output an error code that you can use to fix the problem.

"As I understand from my reading, undefined-behavior is the result of leaving the compiler with several non-identical alternatives at compile time."

While that may be one source of undefined behavior, you're speaking too abstractly. You need a specific example of what you mean by "non-identical alternatives at compile time."

If by "follow strict coding practice," you mean don't use logic that results in undefined behavior, then yes (because there would be no undefined behavior). Tracking down a bug because of undefined behavior may or may not be easier than tracking one caused by a logic error.

Note that code which results in "undefined behavior" is still legal C++ code. I consider it a class of code/logic that should be used only very seldom, when that "undefined behavior" is predictable for a given program on a given platform with a given implementation. You will find cases where what the language considers "undefined behavior" is in fact defined for a particular environment or set of constraints.

Licensed under: CC-BY-SA with attribution