math.h pow vs manual power performance

Question 1

The implementation of pow() behind math.h is far more complex than simple repeated squaring; take a look at this freely available implementation (link).

A problem with repeated squaring is that it is not general enough to handle fractional powers. The pow() from math.h must handle them, so it is necessarily slower on some of the test cases. Because the repeated squaring function does not offer the same functionality, the comparison is not apples-to-apples.

Generally speaking, it is much easier to optimize for performance if you do not need to handle the general case. For example, if you never raise numbers to fractional powers, you could potentially write an algorithm that beats the library function 3:1 in a micro-benchmark. That speed comes with the understanding that the "faster" function applies to a narrower range of inputs.
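To make the trade-off concrete, here is a minimal sketch of a repeated squaring routine restricted to non-negative integer exponents. The name ipow and its signature are my own invention for illustration, not something from the question or from any library.

```c
/* Hypothetical helper (not part of math.h): raises x to a non-negative
   integer power n by repeated squaring, using O(log n) multiplications. */
double ipow(double x, unsigned int n)
{
    double result = 1.0;

    while (n > 0) {
        if (n & 1u)        /* lowest bit set: fold the current square in */
            result *= x;
        x *= x;            /* square the base for the next bit */
        n >>= 1;           /* move on to the next bit of the exponent */
    }
    return result;
}
```

A call such as ipow(2.0, 10) works, but there is no way to express something like pow(2.0, 0.5) with this interface, which is exactly the generality the library function has to pay for.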

Question 2

According to the ISO C99 standard (ISO/IEC 9899:1999), section 7.12.7.4:

Description

The pow functions compute x raised to the power y. A domain error occurs if x is finite and negative and y is finite and not an integer value. A domain error may occur if x is zero and y is less than or equal to zero.

Returns

The pow functions return x^y.

In other words, it doesn’t specify the exact algorithm to be used. You’d have to look at the source code for the C/C++ standard library that you’re using. I would assume most library authors have used a highly-optimized algorithm.

Update: In comments, you say that you are using MinGW32. That links against Microsoft’s runtime, msvcrt. It is not open source, so all we know, from Microsoft’s documentation, is that its pow() uses SSE2. It’s likely very efficient.