Question

Which is better for program speed after compiler-optimization: return-by-value, or return-by-reference to a persistent object?

/// Generate a 'foo' value directly as a return type.
template< typename T >
inline T gen_foo();

/// Get a 'foo' reference of a persistent object.
template< typename T >
inline T const& get_foo();

T will be primitives, pointers, member-pointers, or small, user-defined, P.O.D.-like data.

To the best of my knowledge, return-by-value is the faster of the two, but there is a possible case for return-by-reference:

  • return-by-value:

    • a single T is a small object and fast to copy into a caller's variable.
    • optimizer can use (N)RVO and copy-elision to remove return copies.
    • optimizer can inline the generating code or the generated value into the caller's code.
    • program will not need to access RAM, cached or not.
  • return-by-reference:

    • optimizer might evaluate the persistent value fully, and replace its use with a literal equivalent. Whether or not this occurs affects the rest of the analysis.
    • if the persistent value is fully-evaluated and substituted as a literal:
      • no value to return.
      • optimizer can inline the literal easily.
      • program won't need to access RAM, cached or not.
    • if the persistent value can't be fully evaluated and substituted:
      • a single reference is a small object and fast to copy into a caller's variable.
      • optimizer can use (N)RVO and copy-elision to avoid return copies.
      • optimizer can't inline the generating code or the generated value into the caller's code.
      • program would need to access RAM, although this likely would be in L1/L2/etc. cache.
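
One way to check the reasoning above is to compile both variants for a concrete type and compare the generated assembly or measure the callers. The following is only a minimal sketch: `Vec2`, `gen_vec2`, `get_vec2`, and the two `sum_*` callers are hypothetical names that do not appear in the question, and the persistent object is assumed to be constant-initialized, which is the case most favourable to the by-reference variant.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical small POD-like type standing in for T.
struct Vec2 { float x; float y; };

// Return-by-value: the result can be built directly in the caller's
// storage or in registers.
inline Vec2 gen_vec2()
{
    return Vec2{ 1.0f, 2.0f };
}

// Return-by-reference: a persistent object. It is constant-initialized
// here, so the optimizer may be able to substitute the literal values.
inline Vec2 const& get_vec2()
{
    static constexpr Vec2 value{ 1.0f, 2.0f };
    return value;
}

// Two otherwise identical callers, intended for side-by-side comparison.
inline float sum_by_value( std::size_t n )
{
    float total = 0.0f;
    for ( std::size_t i = 0; i < n; ++i )
    {
        Vec2 const v = gen_vec2();
        total += v.x + v.y;
    }
    return total;
}

inline float sum_by_reference( std::size_t n )
{
    float total = 0.0f;
    for ( std::size_t i = 0; i < n; ++i )
    {
        Vec2 const& v = get_vec2();
        total += v.x + v.y;
    }
    return total;
}

// Both variants must agree on the result; only their cost may differ.
inline void check_equivalence()
{
    assert( sum_by_value( 4 ) == sum_by_reference( 4 ) );
}
```

Whether the two callers compile to the same code depends on the compiler, the optimization level, and on whether the persistent object can be evaluated at compile time, so the sketch is a starting point for measurement rather than an answer.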

Background:

I'm being forced to consider this because on some platforms, some floating-point exceptions get triggered if I return-by-value, but not if I fill-by-parameter-reference. ( This is a given; this question is not to debate this point. ) So the API I wanted and the API I'm forced to consider using are:

/// Generate a 'foo' value directly as a return type.
template< typename T >
inline T gen_foo();

/// Fill in a 'foo' passed in by reference.
template< typename T >
inline void fill_foo( T& r_foo );
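
The difference between these two declarations shows up mostly at the call site. The sketch below uses hypothetical non-template stand-ins ( `gen_answer`, `fill_answer`, `twice` ) purely for illustration; they are assumptions and not part of the API above.

```cpp
#include <cassert>

// Hypothetical stand-ins for the two API styles.
inline int gen_answer() { return 42; }
inline void fill_answer( int& r_answer ) { r_answer = 42; }

// Hypothetical consumer that takes its argument by value.
inline int twice( int v ) { return 2 * v; }

// 'gen' style: definition and initialization happen together, the
// variable can be const, and a temporary can be passed on directly.
inline int use_gen()
{
    int const a = gen_answer();
    return twice( gen_answer() ) + a;
}

// 'fill' style: the variable is defined first and initialized later,
// it cannot be const, and every use needs a named object.
inline int use_fill()
{
    int a;
    fill_answer( a );
    int tmp;
    fill_answer( tmp );
    return twice( tmp ) + a;
}

// Both styles compute the same result; only the call-site shape differs.
inline void check_same_result()
{
    assert( use_gen() == use_fill() );
}
```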

Since I abhor the 'fill' API ( because it separates definition from initialization, prevents creating temporaries, etc. ), I can transform it into a return-by-reference version instead, something like:

/// Forward-declare 'Initialized_Foo'.
template< typename T > struct Initialized_Foo;

/// Get a 'foo' reference; this returns a persistent reference to a static object.
template< typename T >
inline T const& get_foo()
{
    #if 0
    // BAD: This calls 'fill_foo' *every* time, and breaks const-correctness.
    thread_local static const T foo;
    fill_foo( const_cast< T& >( foo ) );
    return foo;
    #else
    // GOOD: This calls 'fill_foo' only *once*, and honours const-correctness.
    thread_local static const Initialized_Foo< T > initialized_foo;
    return initialized_foo.data;
    #endif
}

/// A 'foo' initializer to call 'fill_foo' at construction time.
template< typename T >
struct Initialized_Foo
{
    T data;
    Initialized_Foo()
    {
        fill_foo( data );
    }
};
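
To see that the wrapper really calls 'fill_foo' only once per thread, the pattern can be exercised with a concrete 'fill_foo'. This is a minimal single-threaded sketch: the body of `fill_foo`, the counter `g_fill_calls`, and the value `7` are assumptions made up for the demonstration, while `Initialized_Foo` and `get_foo` follow the code above.

```cpp
#include <cassert>

// Hypothetical counter, used only to observe how often 'fill_foo' runs.
static int g_fill_calls = 0;

// Hypothetical 'fill_foo': the real one is platform-specific.
template< typename T >
inline void fill_foo( T& r_foo )
{
    ++g_fill_calls;
    r_foo = T( 7 );
}

// A 'foo' initializer that calls 'fill_foo' at construction time.
template< typename T >
struct Initialized_Foo
{
    T data;
    Initialized_Foo()
    {
        fill_foo( data );
    }
};

// Get a 'foo' reference to a per-thread persistent object.
template< typename T >
inline T const& get_foo()
{
    thread_local static const Initialized_Foo< T > initialized_foo;
    return initialized_foo.data;
}

// Repeated calls on one thread return the same object, filled once.
inline void check_fill_runs_once()
{
    int const& a = get_foo< int >();
    int const& b = get_foo< int >();
    assert( &a == &b );
    assert( a == 7 );
    assert( g_fill_calls == 1 );
}
```

Because the object is 'thread_local', each thread that calls 'get_foo' constructs its own copy, and the compiler typically has to emit a check on each call to see whether the object has already been initialized. That check is part of the cost being weighed against return-by-value.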

No correct solution

Licensed under: CC-BY-SA with attribution