Question

I am looking for an efficient algorithm to compute $$f(x) = x^a, x\in[0, 1] \land a\,\text{constant}$$

If it helps, $a$ is either $2.2$ or $1/2.2$. Knowing the valid ranges of $x$ and $a$ could make it possible to take shortcuts that a general pow function cannot.

A general Taylor series expansion has serious convergence problems as $x \to 0^+$, at least when $a < 1$.

  • The algorithm must be faster than pow
  • The maximum error on the specified interval must be less than or equal to $10^{-4}$.

Solution

The Weierstrass Approximation Theorem states that any continuous function can be approximated on a bounded interval $[a, b]$ as closely as desired by a polynomial. In practice, instead of using Taylor series expansions, one uses polynomial minimax approximations $p(x)$ which approximate the function on the chosen interval with the minimal maximum error (thus the name).

When one looks at the graph of the error function $f(x) - p(x)$ one observes that it oscillates such that all extrema are of equal magnitude and neighboring extrema have opposite sign (equi-oscillation property), and that there are a total of $n+2$ extrema for a polynomial of degree $n$. This behavior of the error function can be used to construct the minimax polynomial, as also noted in gnasher729's answer.

The standard algorithm used to do this is called the Remez algorithm, named after Evgeny Remez, who published it in 1934. This functionality can be found in ready-to-use form in widely used tools such as Maple and Mathematica, as well as the software tool Sollya, which provides an fpminimax command to generate minimax polynomial approximations. One can also create one's own implementation of the Remez algorithm, and I have used such code for this answer.

While the minimax polynomial is well defined mathematically, when the coefficients need to be stored in a finite-precision floating-point format and the polynomial is evaluated in finite-precision floating-point arithmetic, the error function $f(x) - p(x)$ becomes deformed and no longer satisfies the equi-oscillation criterion. That means we wind up with a polynomial that is merely close to the true minimax polynomial. Many tools do not adjust for this effect at all, delivering coefficients based on high-precision internal computations. Sollya takes the finite-precision storage of the coefficients into consideration, and thus can often deliver superior results.

A minimax polynomial of degree five suffices to approximate $x^{2.2}$ to within the accuracy desired by OP. $f(x)=p(x)+\epsilon, \:\: |\epsilon| < 7.4\cdot10^{-5}$. We can observe the equi-oscillation property of the error function $f(x) - p(x)$ and its seven extrema by plotting it, e.g. with Wolfram Alpha. Exemplary C code looks as follows:

/* Approximate x**(2.2) on [0,1] with minimax polynomial. Max. err < 7.4e-5 */
float pow_2p2_minimax (float x)
{
    const float c0 =  1.06425546e-1f; //  0x1.b3eb46p-4
    const float c1 = -3.56089801e-1f; // -0x1.6ca2cep-2
    const float c2 =  5.86735249e-1f; //  0x1.2c6890p-1
    const float c3 =  6.73461437e-1f; //  0x1.58cff0p-1
    const float c4 = -1.05324369e-2f; // -0x1.59207cp-7
    const float c5 =  7.38649687e-5f; //  0x1.35cfe8p-14
    return ((((c0 * x + c1) * x + c2) * x + c3) * x + c4) * x + c5;
}
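
To double-check the stated bound, one can scan the interval on a dense grid and compare against a higher-precision reference. The following is a minimal sketch, not part of the original answer; the grid step of 1e-6 is an arbitrary choice, and pow_2p2_minimax() from above is assumed to be defined in the same file:

#include <math.h>
#include <stdio.h>

/* Sketch: estimate the maximum absolute error of pow_2p2_minimax() on [0,1]
   by scanning a dense grid (step size chosen arbitrarily) and comparing
   against a double-precision reference computed with pow(). */
int main (void)
{
    double max_err = 0.0;
    for (double x = 0.0; x <= 1.0; x += 1e-6) {
        double err = fabs ((double)pow_2p2_minimax ((float)x) - pow (x, 2.2));
        if (err > max_err) max_err = err;
    }
    printf ("max. abs. error = %.6e\n", max_err);
    return 0;
}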

Note that the minimax polynomial does not return zero at the origin. If we desire this property, we can enforce it by using $f(x) = x^{2}p(x) + \epsilon$. An example, with $|\epsilon| < 1.33 \cdot 10^{-4}$, is shown in this exemplary C implementation:

/* Approximate x**(2.2) on [0,1] with constrained minimax. Max. err < 1.33e-4 */
float pow_2p2_constraint (float x)
{
    const float c0 = -3.66555989e-1f; // -0x1.775a74p-2
    const float c1 =  1.19028902e+0f; //  0x1.30b6c8p+0
    const float c2 = -1.55231142e+0f; // -0x1.8d6448p+0 
    const float c3 =  1.19035530e+0f; //  0x1.30bb20p+0
    const float c4 =  5.38091123e-1f; //  0x1.1380aep-1
    return x * x * ((((c0 * x + c1) * x + c2) * x + c3) * x + c4);
}

The computation of $x^{1/2.2}$ is a bit trickier. The easiest approach that does not suffer from premature underflow seems to be the following: compute $y = \sqrt{\sqrt{x}} = x^{1/4}$, then approximate $y^{20/11} = x^{5/11} = x^{1/2.2}$ with a minimax polynomial, i.e. $x^{1/2.2} = p(y) + \epsilon$. The square root function is very fast on many modern processors, but slow on others, so this may not be a universal win, depending on how pow() is implemented. To achieve the error bound desired by OP, a minimax polynomial of degree six is needed, resulting in $|\epsilon| < 7.5 \cdot 10^{-5}$. Exemplary C code looks like so:

/* Approximate x**(1/2.2) on [0,1] with minimax polynomial. Max. err < 7.5e-5 */
float pow_0p4545_minimax (float x)
{
    const float c0 =  3.07896197e-1f; //  0x1.3b4924p-2
    const float c1 = -1.06079876e+0f; // -0x1.0f9082p+0
    const float c2 =  1.48817670e+0f; //  0x1.7cf926p+0
    const float c3 = -1.18180847e+0f; // -0x1.2e8b00p+0
    const float c4 =  1.42678642e+0f; //  0x1.6d41e0p+0
    const float c5 =  1.98969673e-2f; //  0x1.45fdeep-6
    const float c6 = -7.45610159e-5f; // -0x1.38bb48p-14

    /* compute x**0.25 */
    x = sqrtf (sqrtf (x));
    /* compute (x**0.25)**(20/11) = x**(1/2.2) */
    return (((((c0 * x + c1) * x + c2) * x + c3) * x + c4) * x + c5) * x + c6;
}

OTHER TIPS

You achieve the best approximation of a continuous function by a polynomial of degree n if the error has n+2 minima / maxima of equal magnitude and alternating sign. A polynomial interpolating at n+1 points has degree n. So if you think a cubic polynomial might be good enough, create a spreadsheet where you enter 4 interpolation points, calculate the interpolation polynomial and the error, draw a graph of the error, and then move the interpolation points around until the error has five minima / maxima of equal magnitude and alternating sign.
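
The same experiment can also be scripted rather than done in a spreadsheet. Here is a minimal sketch for $x^{2.2}$, assuming Lagrange interpolation at four hand-picked nodes; the node positions given are only a starting guess to be nudged by hand:

#include <math.h>
#include <stdio.h>

/* Sketch: interpolate f(x) = x**2.2 at four nodes with a cubic (Lagrange
   form) and scan the error over [0,1], so the nodes can be moved around
   until the five error extrema have equal magnitude and alternating sign.
   The node positions below are only an initial guess. */
static double f (double x) { return pow (x, 2.2); }

static double cubic_interp (const double xn[4], const double yn[4], double x)
{
    double sum = 0.0;
    for (int i = 0; i < 4; i++) {
        double term = yn[i];
        for (int j = 0; j < 4; j++) {
            if (j != i) term *= (x - xn[j]) / (xn[i] - xn[j]);
        }
        sum += term;
    }
    return sum;
}

int main (void)
{
    double xn[4] = { 0.10, 0.40, 0.70, 0.95 }; /* move these by hand */
    double yn[4];
    for (int i = 0; i < 4; i++) yn[i] = f (xn[i]);
    double max_err = 0.0;
    for (double x = 0.0; x <= 1.0; x += 1e-4) {
        double err = fabs (f (x) - cubic_interp (xn, yn, x));
        if (err > max_err) max_err = err;
    }
    printf ("max. abs. error = %.6e\n", max_err);
    return 0;
}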

If you want to keep the relative error small, approximate $f(x)/x$ and multiply by $x$ afterwards.

For $f(x) = x^{1/2.2}$, note that $1/2.2 = 5/11 = 1 - \tfrac{1}{2} - \tfrac{1}{22}$, so it can be calculated as $x \cdot x^{-1/2} \cdot x^{-1/22}$.

Most processors provide a fast approximation to $x^{-1/2}$ which can be improved enough with one step of Newton iteration very quickly. And $x^{-1/22}$ varies quite slowly over most of the interval. Remember that you don't have to limit the error in $x^{-1/22}$ in isolation, but in the product.
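
As a minimal sketch of the reciprocal square root part (assuming an x86 SSE hardware estimate is available; other architectures provide similar instructions), the estimate can be refined with one Newton step like this:

#include <xmmintrin.h>

/* Sketch (assumes x86 SSE): take the hardware reciprocal square root
   estimate (about 12 bits) and refine it with one Newton-Raphson step,
   y' = y * (1.5 - 0.5 * x * y * y). */
static float rsqrt_refined (float x)
{
    float y = _mm_cvtss_f32 (_mm_rsqrt_ss (_mm_set_ss (x)));
    return y * (1.5f - 0.5f * x * y * y);
}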

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange