Question

I was wondering how the pow function from math.h works. Does it implement the simplest sequential algorithm, or does it use a different one?

The only alternative I know is the repeated squaring algorithm, which needs O(log n) multiplications. Is that what pow implements?

I ran some tests comparing the sequential algorithm against pow and found that the first is almost 3 times faster than the second. Does the function call overhead really hurt performance that much in this test? Why?

Any other comments explaining what's happening, or how pow is implemented are welcome.

EDIT: I was wrong, pow is 3 times faster than the sequential algorithm.

Solution

The implementation of pow() in math.h is a lot more complex than that - take a look at this freely available implementation (link).

A problem with repeated squaring is that it is not general enough to handle fractional exponents. The pow() from math.h must handle them, so it can be slower than a specialized routine on some test cases. And since the repeated squaring function does not offer the same functionality, the comparison is not apples-to-apples.

Generally speaking, it is much easier to optimize for performance if you do not need to handle the general case. For example, if you never raise numbers to fractional powers, you could potentially write an algorithm that beats the library function 3:1 in a micro-benchmark. The trade-off is that the "faster" function is not as widely applicable.
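For reference, the two integer-only approaches mentioned in the question might look roughly like this. This is only a sketch: the names pow_sequential and pow_squaring are made up for illustration, and neither is claimed to be what any real math.h does.

```c
/* Hypothetical helper: naive loop, exp multiplications (O(n)). */
double pow_sequential(double base, unsigned int exp)
{
    double result = 1.0;
    for (unsigned int i = 0; i < exp; ++i)
        result *= base;
    return result;
}

/* Hypothetical helper: repeated squaring, O(log n) multiplications.
 * Only works for non-negative integer exponents. */
double pow_squaring(double base, unsigned int exp)
{
    double result = 1.0;
    while (exp > 0) {
        if (exp & 1u)       /* lowest bit set: fold current power into result */
            result *= base;
        base *= base;       /* base, base^2, base^4, ... */
        exp >>= 1;
    }
    return result;
}
```

Both give 1024.0 for (2.0, 10), just like pow(2.0, 10.0). Neither signature can even express a call like pow(2.0, 0.5), which is the generality the library function has to pay for.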

OTHER TIPS

According to the C99 standard (ISO/IEC 9899:1999), section 7.12.7.4:

Description

The pow functions compute x raised to the power y. A domain error occurs if x is finite and negative and y is finite and not an integer value. A domain error may occur if x is zero and y is less than or equal to zero.

Returns

The pow functions return x^y.

In other words, it doesn’t specify the exact algorithm to be used. You’d have to look at the source code for the C/C++ standard library that you’re using. I would assume most library authors have used a highly-optimized algorithm.

Update: In the comments, you say that you are using MinGW32, which links against Microsoft’s runtime, msvcrt. It is not open source, so all we know from Microsoft’s documentation is that it uses SSE2. It’s likely very efficient.

Licensed under: CC-BY-SA with attribution