Question

Which is the fastest way to implement an operation that returns the absolute value of a number?

x=root(x²)

or

if !isPositive(x):
    x=x*(-1)

Actually this question can be translated as: how fast is an if (and why, please)?

My college programming professors always told me to avoid ifs because they are extremely slow, but I always forgot to ask how slow, and why. Does anybody here know?


Solution

Conditionals are slower than plain arithmetic operations, but much, much faster than something as silly as calculating the square root.

Rules of thumb from my assembly days:

  • Integer or bitwise op: 1 cycle
  • Floating-point add/sub/mul: 4 cycles
  • Floating-point div: ~30 cycles
  • Floating-point exponentiation: ~200 cycles
  • Floating-point sqrt: ~60 cycles depending on implementation
  • Conditional branch: avg. 10 cycles, better if well-predicted, much worse if mispredicted
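To make the comparison concrete, here is a minimal C sketch of both approaches from the question (the function names, and the hand-rolled isqrt() used to avoid pulling in libm, are my own):

```c
/* Branching version: test and negate. */
static int abs_branch(int x) {
    if (x < 0)
        x = -x;
    return x;
}

/* Integer square root via simple Newton iteration -- a stand-in for
   a library sqrt() so this sketch needs no libm. Note how much more
   work it is than the single comparison above. */
static unsigned isqrt(unsigned n) {
    unsigned x = n, y = (n + 1) / 2;
    if (n < 2)
        return n;
    while (y < x) {
        x = y;
        y = (x + n / x) / 2;
    }
    return x;
}

/* Square-root version from the question: sqrt(x*x). */
static int abs_sqrt(int x) {
    return (int)isqrt((unsigned)(x * x));
}
```

Both give the same answer for ordinary inputs; the difference is purely in how many instructions each one costs.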

OTHER TIPS

There is a great trick to calculate the absolute value of a two's-complement integer without using an if statement. The theory goes: if the value is negative you want to toggle the bits and add one, otherwise you want to pass the bits through as is. A XOR 1 happens to toggle A, and A XOR 0 happens to leave A intact. So you want to do something like this:

  uint32_t temp = value >> 31;     // make a mask of the sign bit (arithmetic right shift assumed)
  value ^= temp;                   // toggle the bits if value is negative
  value += temp & 1;               // add one if value was negative

In principle, you can do it in as few as three assembly instructions (without a branch). And you'd like to think that the abs() function you get from the standard library does it optimally.
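The trick above, wrapped up as a self-contained function (a sketch; it assumes arithmetic right shift of negative signed integers, which is implementation-defined in C but near-universal on two's-complement hardware):

```c
#include <stdint.h>

/* Branch-free absolute value for a 32-bit two's-complement int.
   Caveat: INT32_MIN has no positive counterpart and comes back unchanged. */
static int32_t abs_branchless(int32_t value) {
    int32_t mask = value >> 31;   /* 0 if non-negative, -1 (all ones) if negative */
    value ^= mask;                /* toggle the bits if negative */
    value += mask & 1;            /* add one if negative */
    return value;
}
```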

No branches == better performance. Contrary to @paxdiablo's response above, this really matters in deep pipelines where the more branches you have in your code, the more likely you are to have your branch predictor get it wrong and have to roll-back, etc. If you avoid branching where possible, things will keep moving along at full throttle in your core :).

Ugh, your teachers actually told you that? The rule most people follow is to make your code readable first, and then tweak any performance problems after they are proven to actually be problems. 99.999% of the time you are never going to see a performance problem because you used one too many if statements. Knuth said it best, "premature optimization is the root of all evil".

Calculating the square root is probably one of the worst things you could do because it is really slow. Usually there is a library function for doing this; something like Math.Abs(). Multiplying by -1 is also unnecessary; just return -x. So a good solution would be the following.

(x >= 0) ? x : -x

The compiler will probably optimize this to a single instruction. Conditionals can be quite expensive on modern processors because of their long execution pipelines: if a branch is mispredicted and the processor has started executing instructions from the wrong code path, those calculations must be thrown away. But because of the compiler optimization mentioned above, you need not care in this case.

For completeness, here's a way to do it for IEEE floats on x86 systems in C++:

*(reinterpret_cast<uint32_t*>(&foo)) &= 0xffffffff >> 1;
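That one-liner violates strict-aliasing rules, though. Here is a well-defined sketch of the same idea (shown in C, copying the bits through memcpy; the function name is my own):

```c
#include <stdint.h>
#include <string.h>

/* Clear the IEEE-754 sign bit of a 32-bit float without aliasing UB.
   Compilers typically optimize the memcpy pair away entirely. */
static float fabs_bits(float foo) {
    uint32_t bits;
    memcpy(&bits, &foo, sizeof bits);  /* reinterpret the float's bits */
    bits &= 0x7fffffffu;               /* mask off the sign bit */
    memcpy(&foo, &bits, sizeof foo);
    return foo;
}
```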

The if variant will almost certainly be blindingly fast compared to the square root, since it normally translates to a conditional jump instruction at the machine code level (following the evaluation of the expression, which may be complex, but not in this case since it's a simple check for less than 0).

Taking the square root of a number is likely to be much slower (Newton's method, for example, would use many, many branches at the machine code level).

The likely source of confusion is the fact that an if invariably leads to changing the instruction pointer in a non-sequential manner. This can slow down processors that pre-fetch instructions into a pipeline, since they have to re-populate the pipeline when the address changes unexpectedly.

However, the cost of that would be minuscule compared to performing a square root operation as opposed to a simple check-and-negate.

Which is the fastest way to get the absolute value of a number

I think the "right" answer isn't here actually. The fastest way to get the absolute value of a number is probably to use an Intel intrinsic. See https://software.intel.com/sites/landingpage/IntrinsicsGuide/ and look for 'vpabs' (or another intrinsic that does the job for your CPU). I'm pretty sure it'll beat all the other solutions here.

If you don't like intrinsics (or cannot use them or ...), you might want to check whether the compiler is smart enough to turn a call to the 'native absolute value' (std::abs in C++ or Math.Abs(x) in C#) into the intrinsic automagically - basically that involves looking at the disassembled (compiled) code. If you're in a JIT, be sure that JIT optimizations aren't disabled.

If that also doesn't give you the optimized instructions, you can use the method described here: https://graphics.stanford.edu/~seander/bithacks.html#IntegerAbs .

The modulo operation finds a remainder; you mean absolute value. I modified the question because it should be if !pos(x) then x = x*-1 (the not was missing).

I wouldn't worry about the efficiency of an if statement. Instead focus on the readability of your code. If you identify that there is an efficiency problem, then focus on profiling your code to find real bottlenecks.

If you want to keep an eye out for efficiency while you code, you should only worry about the big-O complexity of your algorithms.

If statements are very efficient: the processor evaluates the expression and then simply changes the program counter based on the condition. The program counter stores the address of the next instruction to be executed.

Multiplication by -1 and checking whether a value is greater than 0 can each be reduced to a single assembly instruction.

Squaring a number and then finding its root is definitely more operations than the if with a negation.

The time taken to do a square root is much greater than the time taken to do a conditional. If you have been taught to avoid conditionals because they are slow, then you have been misinformed. They are a good deal slower than trivial operations like adding or subtracting integers or bit shifting - which is why unrolling loops can be of benefit only if you are doing such trivial operations. But in the grand scheme of things conditionals are good and fast, not bad and slow. To do something as complicated as calling a function or calculating a square root to avoid a conditional statement is crazy.

Also, instead of (x = x * -1) why not do (x = 0 - x)? Maybe the compiler will optimize them the same, but isn't the second one simpler anyway?

Are you using 8086 assembly? ;-)

                ; abs value of AX
   cwd          ; replicate the high bit into DX
   xor  ax, dx  ; take 1's complement if negative; no change if positive
   sub  ax, dx  ; AX is 2's complement if it was negative

                ; the standard absolute value method works on any
                ; register but is much slower:
   or   bx, bx  ; see if number is negative
   jge  notneg  ; if it is negative...
   neg  bx      ; ...make it positive
notneg:         ; jump to here if positive

(flagrantly stolen)

If you are simply comparing the absolute values of two numbers (e.g. you don't need the absolute value of either after the comparison) then just square both values to make both positive (remove the sign of each value), the larger square will be greater than the smaller square.
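A sketch of that comparison in C (widening to 64 bits so the squares cannot overflow for any 32-bit inputs; the function name is my own):

```c
#include <stdint.h>

/* Compare |a| and |b| without computing either absolute value:
   squaring removes the sign, and the larger magnitude has the
   larger square. Returns 1 if |a| < |b|, else 0. */
static int abs_less(int32_t a, int32_t b) {
    int64_t a2 = (int64_t)a * a;
    int64_t b2 = (int64_t)b * b;
    return a2 < b2;
}
```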

What is faster depends heavily on which compiler and which CPU you're targeting. On most CPUs and with most compilers, x = (x>=0) ? x : -x; is the fastest way to get the absolute value, and in fact the standard functions often already offer this solution (e.g. fabs()). It is compiled into a compare followed by a conditional assignment instruction (CMOV), not into a conditional jump. Some platforms lack that instruction, though. The Intel compiler (but not Microsoft's or GCC) will automatically convert an if() into a conditional assignment, and will even try to optimize loops where possible.

Branching code is generally slower than conditional assignment when the CPU uses statistical branch prediction. An if() can be slower on average if the operation is repeated many times and the result of the condition keeps changing. CPUs such as Intel's will start to execute both branches and drop the invalid one; with large if() bodies or large iteration counts that can become critical.

sqr() and sqrt() on modern Intel CPUs map to single built-in instructions and aren't slow, but they are imprecise, and loading the registers takes time as well.

Related question: Why is a CPU branch instruction slow?

Most likely, the professor wanted the student to research the matter; it's a semi-provocative question/task that can only do good if it teaches the student to think independently and look for additional sources.

I'm doing some retro graphics programming in C for the 8088/8086, and calling abs() is time-consuming, so I've replaced it with:

/* assuming 'i' is int; this WILL NOT WORK on floating point */
if (i < 0) {
    i = ~i + 1;
}

The reason this is faster is because it essentially trades a CALL in assembly for a JNE. Calling a function clobbers a couple of registers, pushes several more, pushes arguments onto the stack, and can flush the prefetch queue. All of these actions need to be reversed at the end of the function, and all of this is very expensive on that CPU.
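For reference, here is the replacement as a complete function (the name is my own):

```c
/* Absolute value without a library call: ~i + 1 is two's-complement
   negation, so this compiles down to a test and a NEG with no CALL. */
static int abs_inline(int i) {
    if (i < 0) {
        i = ~i + 1;
    }
    return i;
}
```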

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow