Question

I was writing some templated code to benchmark a numeric algorithm using both floats and doubles, in order to compare against a GPU implementation.

I discovered that my float code was slower than the double version. After investigating with Intel's VTune Amplifier, I found that g++ was generating extra x86 instructions (cvtps2pd/cvtpd2ps and unpcklps/unpcklpd) to convert some intermediate results from float to double and back again. The performance degradation is almost 10% for this application.

After compiling with the flag -Wdouble-promotion (which BTW is not included with -Wall or -Wextra), sure enough g++ warned me that the results were being promoted.

I reduced this to the simple test case shown below. Note that the ordering of the C++ code affects the generated code. The compound statement (T d1 = log(r)/r;) produces a warning, whilst the separated version (T d = log(r); d/=r;) does not.

The following was compiled with both g++-4.6.3-1ubuntu5 and g++-4.7.3-2ubuntu1~12.04 with the same results.

Compile flags are:

g++-4.7 -O2 -Wdouble-promotion -Wextra -Wall -pedantic -Werror -std=c++0x test.cpp -o test

#include <cstdlib>
#include <iostream>
#include <cmath>

template <typename T>
T f()
{
        T r = static_cast<T>(0.001);

        // Gives no double promotion warning
        T d = log(r);
        d/=r;
        // Promotes to double
        T d1 = log(r)/r;

        return d+d1;
}

int main()
{
        float f1 = f<float>();
        std::cout << f1 << std::endl;
}

I realise that the C++11 standard allows the compiler discretion here. But why does the order matter?

Can I explicitly instruct g++ to use floats only for this calculation?

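EDIT: Solved by Mike Seymour. I needed to use std::log to pick up the overloaded version of log instead of calling the C function double log(double). The separated statement produced no warning because assigning the double result of log(r) to a float is a narrowing conversion, not a promotion, and -Wdouble-promotion only reports promotions. In the compound statement, r itself is promoted to double for the division, which is what triggers the warning.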
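The distinction between the two statements can be checked at compile time. The following is a minimal sketch, not part of the original test case. The helper name quotient_is_double is made up for illustration, and it looks only at the usual arithmetic conversions, independently of which log overload is called.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper: true when dividing an A by a B yields a double.
template <typename A, typename B>
constexpr bool quotient_is_double()
{
    return std::is_same<decltype(std::declval<A>() / std::declval<B>()),
                        double>::value;
}

// log(r)/r with the C function: double / float. The float operand is
// promoted to double, which is what -Wdouble-promotion reports.
static_assert(quotient_is_double<double, float>(),
              "double / float is evaluated in double");

// d /= r with both operands float: no promotion takes place.
static_assert(!quotient_is_double<float, float>(),
              "float / float stays in float");
```

In the separated version the only double value is the result of log(r), and storing it in a float is a conversion in the other direction, so the promotion warning has nothing to report.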

Solution

The problem is

log(r)

In this implementation, it seems that the only log in the global namespace is the C library function, double log(double), so the unqualified call converts its float argument to double and returns a double. Remember that it is unspecified whether the C-library headers in the C++ library (such as <cmath>) declare their names in the global namespace as well as in namespace std.

You want

std::log(r)

to ensure that the extra overloads defined by the C++ library are available.
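As a sketch of how the test case from the question might look with this change applied (the static_assert is an addition for illustration and assumes T is a floating-point type):

```cpp
#include <cmath>
#include <type_traits>

template <typename T>
T f()
{
    T r = static_cast<T>(0.001);

    // std::log has overloads for float, double and long double, so the
    // result type matches the argument type.
    static_assert(std::is_same<decltype(std::log(r)), T>::value,
                  "std::log should return the same type as its argument");

    T d = std::log(r);
    d /= r;

    // With T = float, both operands are float: no promotion to double.
    T d1 = std::log(r) / r;

    return d + d1;
}
```

Calling f<float>() from the original main should then compile without the -Wdouble-promotion warning under the flags given in the question.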

Licensed under: CC-BY-SA with attribution