Question

I know that the following is undefined, because I am trying to read and write the value of a variable in the same expression:

int a=5;
a=a++;

but if that is so, then why is the following code snippet not undefined:

int a=5;
a=a+1;

as here, too, I am trying to modify the value of a and write to it at the same time.

Also, please explain why the standard does not cure or remove this undefined behavior, in spite of the fact that the committee knows it is undefined.


Solution 3

Long story short: you can find every defined behavior in the standard. Everything that is not mentioned there as defined is undefined.

An intuitive explanation of your example:

a=a++;

You want to modify the variable a two times in a single statement.

1) a= //first time
2) a++ //second time

If you look here:

a=a+1;

You modify variable a only once:

a= // (a+1) - doesn't change the value of a

Why doesn't the standard define the behavior of a=a++?

One possible reason is that it leaves the compiler free to optimize. The more cases a standard defines, the less freedom the compiler has to optimize your code. Different architectures implement increment instructions differently, and the compiler could not use all of a processor's instructions if some of them would break standard-defined behavior. In some cases the compiler can also change the evaluation order; defining what happens when you modify something twice would force the compiler to disable such optimizations.
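
As a concrete illustration, a minimal sketch (not part of the original answer): compilers accept this program, but the language does not fix its output, so different compilers or optimization levels may legitimately print different values.

#include <stdio.h>

int main(void) {
    int a = 5;
    a = a++;             /* undefined behaviour: a is modified twice */
    printf("%d\n", a);   /* may print 5, 6, or anything else */
    return 0;
}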

Other tips

The reason it is undefined is not that you read and write; it is that you write twice.

a++ means read a and increment it after reading, but we don't know whether the ++ will happen before the assignment with = (in which case the = will overwrite a with its old value) or after it (in which case a will end up incremented).

Just use a++; :)

a = a + 1 does not have the problem as a is only written once.
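
Here is a sketch that spells out the two orderings by hand (the variable old is mine, purely for illustration); a compiler may effectively pick either one, which is exactly why the result is undefined:

#include <stdio.h>

int main(void) {
    /* ordering 1: the ++ side effect happens first, then = stores
       the old value of a++, overwriting the increment */
    int a = 5;
    int old = a;     /* the value of a++ is the old a */
    a = a + 1;       /* side effect of ++ */
    a = old;         /* = stores the old value: a == 5 */
    printf("ordering 1: a = %d\n", a);

    /* ordering 2: = stores first, the ++ side effect applies afterwards */
    a = 5;
    old = a;
    a = old;         /* = stores the old value */
    a = a + 1;       /* ++ side effect applied last: a == 6 */
    printf("ordering 2: a = %d\n", a);
    return 0;
}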

why the following code snippet is not undefined

int a=5;
a=a+1;  

The Standard states that

Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored.

In the case of a = a + 1;, a is modified only once, and the prior value of a is accessed only to determine the value to be stored in a.
In the case of a = a++;, a is modified more than once: by the ++ operator in the sub-expression a++, and by the = operator when the result is assigned to the left-hand a. It is not defined which modification, the one by ++ or the one by =, takes place first.
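
A short sketch applying the quoted rule (the undefined cases are left commented out so that the program itself stays well-defined):

#include <stdio.h>

int main(void) {
    int a = 5, b;

    a = a + 1;   /* defined: a modified once; its prior value is read
                    only to determine the value to be stored */
    b = a++;     /* defined: a and b are distinct objects */
    printf("a = %d, b = %d\n", a, b);   /* prints a = 7, b = 6 */

    /* a = a++;       undefined: a modified twice between sequence points */
    /* a = a++ + 1;   undefined for the same reason */
    return 0;
}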

Almost all modern compilers, when invoked with the -Wall flag, will raise a warning on compiling the first snippet, like:

[Warning] operation on 'a' may be undefined [-Wsequence-point]

Further reading: How can I understand complex expressions like the ones in this section, and avoid writing undefined ones?

The ++ operator will add one to a, meaning the variable a will become a+1. In effect, the following two statements are equivalent:

a++;
a = a + 1;

The expression a + 1 by itself will not increase a; it only generates a result that has the value a + 1. If you want a to become a+1, you have to assign the result of a + 1 to a with

a = a + 1;

The reason the first statement you wrote won't work is that you are effectively writing something like

a = (a = a + 1);
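
A small sketch of the value-versus-side-effect distinction (the variable b is mine, just to hold the result):

#include <stdio.h>

int main(void) {
    int a = 5;
    int b = a + 1;   /* a + 1 only produces a value; a is unchanged */
    printf("a = %d, b = %d\n", a, b);   /* a = 5, b = 6 */

    a++;             /* same effect as a = a + 1 */
    printf("a = %d\n", a);              /* a = 6 */
    return 0;
}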

Others have already talked about the details of your specific example, so I'll add some general information and tools that help to catch undefined behaviour.

There is no ultimate tool or method to catch undefined behaviour, so even if you utilize all of these tools, there is no guarantee that your code is free of undefined behaviour. But in my experience these will catch quite a lot of the common issues. I'm not listing the standard good practices of software development, like unit testing, which you should be using anyway.

  • clang (and clang-analyze) has several options that can help with catching undefined behaviour, both at compile time and at runtime. It has -ftrapv, newly acquired support for canary values, its address sanitizer, -fcatch-undefined-behavior, et cetera. (A signed-overflow sketch using -ftrapv follows this list.)

  • gcc also has several options to catch undefined behaviour, such as mudflap, its address sanitizer, and the stack protector.

  • valgrind is a fantastic tool for finding memory-related undefined behaviour at runtime.

  • frama-c is a static analysis tool that can find and visualize undefined behaviour. Its ability to find dead code (undefined behaviour can often cause other portions of code to become dead) makes it a pretty useful tool to track down potential security concerns. frama-c has many more advanced features, but can arguably be more difficult to use than...

  • Other commercial static analysis tools that can catch undefined behaviour exist, such as PVS-Studio, Klocwork, et cetera. These usually cost a lot, though.

  • Compile with different compilers and for strange architectures. If you can, why not compile and run your code on an 8-bit AVR chip? A Raspberry Pi (32-bit ARM)? Compile it to JavaScript using emscripten and run it in V8? Doing this tends to be a practical way of catching undefined behaviour that would cause crashes down the line (but does little or nothing for catching lurking UB that may, e.g., cause security issues).
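
As promised in the clang bullet, a minimal sketch of catching one kind of undefined behaviour at runtime. It assumes a compiler that supports -ftrapv (both gcc and clang do); compile with, e.g., cc -ftrapv overflow.c:

#include <limits.h>
#include <stdio.h>

int main(void) {
    int x = INT_MAX;
    x = x + 1;            /* undefined behaviour: signed integer overflow */
    printf("%d\n", x);    /* with -ftrapv the program aborts before this */
    return 0;
}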

Now, as to the ontological reasons why undefined behaviour exists... It is basically there for performance and ease-of-implementation reasons. Many things that are UB in C allow the compiler to perform optimizations that other languages are not capable of. If you compare, e.g., how Java, Python and C handle overflow of signed integer types, you can see that on one extreme end, Python defines it completely, in a fashion convenient for the programmer: ints can in fact become infinitely big. C, on the other end of the spectrum, leaves it undefined; it is your responsibility never to overflow your signed integers. Java is somewhere in between.

But on the other hand, that means that in Python there is no knowing how much work the "int + int" operation will actually perform when executed. It may execute many hundreds of instructions, take a round trip through the operating system to allocate some memory, et cetera. This is pretty bad if you care a lot about performance, or more specifically, consistent performance. C, on the other end of the spectrum, allows the compiler to map "+" to the CPU's native instruction that adds integers (if one exists). Sure, different CPUs may handle overflow differently, but since C leaves that undefined, that's fine: you as the programmer have to take care not to overflow your ints. This means that C gives the compiler the option to compile your "int + int" operations to a single machine instruction on pretty much all CPUs, something compilers can and do take advantage of.
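
A classic sketch of the kind of optimization this enables (the function name is mine, purely for illustration; what happens at INT_MAX depends on the compiler precisely because the overflow is undefined):

#include <limits.h>
#include <stdio.h>

/* Because signed overflow is undefined, a compiler may fold
   x + 1 > x to 1 unconditionally, even though two's-complement
   wraparound would make it false for x == INT_MAX. */
int always_true(int x) {
    return x + 1 > x;
}

int main(void) {
    printf("%d\n", always_true(INT_MAX));   /* often prints 1 at -O2 */
    return 0;
}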

Note that C makes no guarantee that + actually maps directly to a native CPU instruction; it just leaves the possibility open for the compiler to do it that way, and obviously this is something any compiler writer would be eager to take advantage of. Java's way of defining signed integer overflow is less unpredictable (in terms of performance) than Python's, but may not allow + to be turned into a single CPU instruction on many CPU types where C would allow it.

So essentially, C embraces undefined behaviour and opts for (consistent) speed and ease of implementation, where other languages opt for safety or predictable behaviour (from the programmer's perspective). That isn't necessarily a good decision with respect to, e.g., safety and security, but that's where C stands. It boils down to knowing the appropriate tool for the job at hand, and there are definitely many cases where the performance predictability C gives you is absolutely essential.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow