Question

I have come across the need to reorder a variadic list of parameters that is supplied to the constructor of a struct. After being reordered based on their types, the parameters will be stored as a tuple. My question is how this can be done so that a modern C++ compiler (e.g. g++-4.7) will not generate unnecessary load or store instructions. That is, when the constructor is invoked with a list of parameters of variable size, it efficiently pushes each parameter into place based on an ordering over the parameters' types.

Here is a concrete example. Assume that the base type of every parameter (without references, rvalue references, pointers, or qualifiers) is either char, int, or float. How can I make it so that all the parameters of base type char appear first, followed by all of those of base type int (which leaves the parameters of base type float last). The relative order in which the parameters were given should not be violated within sublists of homogeneous base type.

Example: foo::foo() is called with arguments float a, char&& b, const float& c, int&& d, char e. The tuple tupe is std::tuple<char, char, int, float, float>, and it is constructed like so: tuple_type{std::move(b), e, std::move(d), a, c}.

Consider the struct defined below, and assume that the metafunction deduce_reordered_tuple_type is already implemented. How would you write the constructor so that it works as intended? If you think that the code for deduce_reodered_tuple_type, would be useful to you, I can provide it; it's a little long.

template <class... Args> struct foo
{
    // Assume that the metafunction deduce_reordered_tuple_type is defined.
    typedef typename deduce_reordered_tuple_type<Args...>::type tuple_type;
    tuple_type t_;

    foo(Args&&... args) : t_{reorder_and_forward_parameters<Args>(args)...} {}
};

Edit 1 The technique I describe above does have applications in mathematical frameworks that make heavy use of expression templates, variadic templates, and metaprogramming in order to perform aggressive inlining. Suppose that you wish to define an operator that takes the product of several expressions, each of which may be passed by reference, reference to const, or rvalue reference. (In my case, the expressions are conditional probability tables and the operation is the factor product, but something like matrix multiplication works suitably as well.)

You need access to the data provided by each expression in order to evaluate the product. Consequently, you must move the expressions passed as rvalue references, copy the expressions passed by reference to const, and take the addresses of expressions passed by reference. Using the technique I describe above now poses several benefits.

  1. Other expressions can use uniform syntax to access data elements from this expression, since all of the heavy-lifting metaprogramming work is done beforehand, within the class.
  2. We can save stack space by grouping the pointers together and storing the larger expressions towards the end of the tuple.
  3. Implementing certain types of queries becomes much easier (e.g. check whether any of the pointers stored in the tuple aliases a given pointer).

Thank you very much for your help!

Was it helpful?

Solution

Happy 4th of July everyone! Ok, here you go.

It turns out that g++-4.7 is pretty awesome at iniling and creates identical machine code according to my testing (instructions to reproduce results are below the code).

#include <iostream>
#include <tuple>
#include <typeinfo>
#include <type_traits>

template <class... Args>
struct sequence
{
    typedef std::tuple<Args...> tuple_type;
};

template <class U, class V>
struct sequence_cat;

template <class... U, class... V>
struct sequence_cat<sequence<U...>, sequence<V...>>
{
    typedef sequence<U..., V...> type;
};

template <class... V>
struct sequence_cat<void, sequence<V...>>
{
    typedef sequence<V...> type;
};

template <class... U>
struct sequence_cat<sequence<U...>, void>
{
    typedef sequence<U...> type;
};

template <>
struct sequence_cat<void, void>
{
    typedef void type;
};

template <class T>
struct undecorate
{
    typedef typename std::remove_reference<T>::type noref_type;
    typedef typename std::remove_pointer<noref_type>::type noptr_type;
    typedef typename std::remove_cv<noptr_type>::type type;
};

template <class T>
struct deduce_storage_type
{
    typedef T type;
};

template <class T>
struct deduce_storage_type<T&>
{
    typedef T* type;
};

template <class T>
struct deduce_storage_type<const T&>
{
    typedef T type;
};

template <class T>
struct deduce_storage_type<T&&>
{
    typedef T type;
};

template <class T, class... Args>
struct filter_type;

template <class T, class Arg, class... Args>
struct filter_type<T, Arg, Args...>
{
    static constexpr bool pred = 
    std::is_same<typename undecorate<Arg>::type, T>::value;

    typedef typename deduce_storage_type<Arg>::type storage_type;

    typedef typename
    std::conditional<
        pred,
        typename sequence_cat<
            sequence<storage_type>,
            typename filter_type<T, Args...>::type
        >::type,
        typename filter_type<T, Args...>::type
    >::type type;       
};

template <class T, class Arg>
struct filter_type<T, Arg>
{
    static constexpr bool pred =
    std::is_same<typename undecorate<Arg>::type, T>::value;

    typedef typename deduce_storage_type<Arg>::type storage_type;

    typedef typename
    std::conditional<pred, sequence<storage_type>, void>::type
    type;
};

template <class... Args>
struct deduce_sequence_type
{
    typedef typename filter_type<char, Args...>::type char_sequence;
    typedef typename filter_type<int, Args...>::type int_sequence;
    typedef typename filter_type<float, Args...>::type float_sequence;

    typedef typename
    sequence_cat<
        char_sequence,
        typename sequence_cat<
            int_sequence,
            float_sequence
        >::type
    >::type type;
};

template <class T>
struct get_storage_type
{
    static T apply(T t) { return t; }
};

template <class T>
struct get_storage_type<T&>
{
    static T* apply(T& t) { return &t; }
};

template <class T>
struct get_storage_type<const T&>
{
    static T apply(const T& t) { return t; }
};

template <class T>
struct get_storage_type<T&&>
{
    static T&& apply(T&& t) { return std::move(t); }
};

template <class T, class Arg>
struct equals_undecorated_type
{
    static constexpr bool value =
    std::is_same<typename undecorate<Arg>::type, T>::value;
};

template <bool Pred, bool IsNextVoid, class T, class... Args>
struct filter_parameter_impl;

template <class T, class Arg1, class Arg2, class... Args>
struct filter_parameter_impl<false, false, T, Arg1, Arg2, Args...>
{
    typedef typename filter_type<T, Arg2, Args...>::type sequence_type;
    typedef typename sequence_type::tuple_type tuple_type;

    static constexpr bool pred = equals_undecorated_type<T, Arg2>::value;

    static constexpr bool is_next_next_void =
    std::is_same<typename filter_type<T, Args...>::type, void>::value;

    static tuple_type apply(Arg1&&, Arg2&& arg2, Args&&... args)
    {
        return filter_parameter_impl<
            pred, is_next_next_void, T, Arg2, Args...
        >::apply(
            std::forward<Arg2>(arg2),
            std::forward<Args>(args)...
        );
    }
};

template <class T, class Arg1, class Arg2, class... Args>
struct filter_parameter_impl<false, true, T, Arg1, Arg2, Args...>
{
    static void apply(Arg1&&, Arg2&&, Args&&...) {}
};

template <class T, class Arg1, class Arg2, class... Args>
struct filter_parameter_impl<true, false, T, Arg1, Arg2, Args...>
{
    typedef typename
    filter_type<T, Arg1, Arg2, Args...>::type
    sequence_type;

    typedef typename sequence_type::tuple_type tuple_type;

    static constexpr bool pred = equals_undecorated_type<T, Arg2>::value;

    static constexpr bool is_next_next_void =
    std::is_same<typename filter_type<T, Args...>::type, void>::value;

    static tuple_type apply(Arg1&& arg1, Arg2&& arg2, Args&&... args)
    {
        return std::tuple_cat(
            std::make_tuple(get_storage_type<Arg1>::apply(arg1)),
            filter_parameter_impl<
                pred, is_next_next_void, T, Arg2, Args...
            >::apply(
                std::forward<Arg2>(arg2),
                std::forward<Args>(args)...
            )
        );
    }
};

template <class T, class Arg1, class Arg2, class... Args>
struct filter_parameter_impl<true, true, T, Arg1, Arg2, Args...>
{
    typedef typename filter_type<T, Arg1>::type sequence_type;
    typedef typename sequence_type::tuple_type tuple_type;

    static tuple_type apply(Arg1&& arg1, Arg2&&, Args&&...)
    {
        return std::make_tuple(get_storage_type<Arg1>::apply(
            std::forward<Arg1>(arg1)
        ));
    }
};

template <class T, class Arg1, class Arg2>
struct filter_parameter_impl<false, false, T, Arg1, Arg2>
{
    typedef typename filter_type<T, Arg2>::type sequence_type;
    typedef typename sequence_type::tuple_type tuple_type;

    static tuple_type apply(Arg1&&, Arg2&& arg2)
    {
        return std::make_tuple(get_storage_type<Arg2>::apply(
            std::forward<Arg2>(arg2)
        ));
    }
};

template <class T, class Arg1, class Arg2>
struct filter_parameter_impl<false, true, T, Arg1, Arg2>
{
    static void apply(Arg1&&, Arg2&&) {}
};

template <class T, class Arg1, class Arg2>
struct filter_parameter_impl<true, false, T, Arg1, Arg2>
{
    typedef typename filter_type<T, Arg1>::type sequence_type;
    typedef typename sequence_type::tuple_type tuple_type;

    static tuple_type apply(Arg1&& arg1, Arg2&& arg2)
    {
        return std::make_tuple(
            get_storage_type<Arg1>::apply(std::forward<Arg1>(arg1)),
            get_storage_type<Arg2>::apply(std::forward<Arg2>(arg2))
        );
    }
};

template <class T, class Arg1, class Arg2>
struct filter_parameter_impl<true, true, T, Arg1, Arg2>
{
    typedef typename filter_type<T, Arg1, Arg2>::type sequence_type;
    typedef typename sequence_type::tuple_type tuple_type;

    static tuple_type apply(Arg1&& arg1, Arg2&&)
    {
        return std::make_tuple(
            get_storage_type<Arg1>::apply(std::forward<Arg1>(arg1))
        );
    }
};

template <class T, class... Args>
struct filter_parameter;

template <class T, class Arg, class... Args>
struct filter_parameter<T, Arg, Args...>
{
    typedef typename filter_type<T, Arg, Args...>::type sequence_type;

    typedef typename std::conditional<
        std::is_same<sequence_type, void>::value,
        void,
        typename sequence_type::tuple_type
    >::type tuple_type;

    static constexpr bool pred = equals_undecorated_type<T, Arg>::value;

    static constexpr bool is_next_void =
    std::is_same<typename filter_type<T, Args...>::type, void>::value;

    static tuple_type apply(Arg&& arg, Args&&... args)
    {
        return filter_parameter_impl<
            pred, is_next_void, T, Arg, Args...
        >::apply(std::forward<Arg>(arg), std::forward<Args>(args)...);
    }
};

template <bool Is1Void, bool Is2Void, bool Is3Void, class... Args>
struct get_tuple_impl;

template <class... Args>
struct get_tuple_impl<false, false, false, Args...>
{
    typedef typename deduce_sequence_type<Args...>::type sequence_type;
    typedef typename sequence_type::tuple_type tuple_type;

    static tuple_type apply(Args&&... args)
    {
        return std::tuple_cat(
            filter_parameter<char, Args...>::apply(
                std::forward<Args>(args)...
            ),
            filter_parameter<int, Args...>::apply(
                std::forward<Args>(args)...
            ),
            filter_parameter<float, Args...>::apply(
                std::forward<Args>(args)...
            )
        );
    }
};

template <class... Args>
struct get_tuple
{
    typedef typename deduce_sequence_type<Args...>::type sequence_type;

    typedef typename std::conditional<
        std::is_same<sequence_type, void>::value,
        void,
        typename sequence_type::tuple_type
    >::type tuple_type;

    static constexpr bool is1void =
    std::is_same<typename filter_type<char, Args...>::type, void>::value;
    static constexpr bool is2void =
    std::is_same<typename filter_type<int, Args...>::type, void>::value;
    static constexpr bool is3void =
    std::is_same<typename filter_type<float, Args...>::type, void>::value;

    static tuple_type apply(Args&&... args)
    {
        return get_tuple_impl<is1void, is2void, is3void, Args...>::
            apply(std::forward<Args>(args)...);
    }
};

template <class... Args>
struct foo
{
    typedef typename deduce_sequence_type<Args...>::type sequence_type;
    typedef typename sequence_type::tuple_type tuple_type;

    tuple_type t_;

    foo(Args&&... args) : t_{} {}
};

int main()
{
    char a = 5;
    const int b = 6;
    float c = 7;
    int d = 8;
    float e = 9;
    char f = 10;

    auto x = get_tuple<char&, const int&, float&, int&, float&&, char&>::
        apply(a, b, c, d, std::move(e), f);
    //std::tuple<char*, char*, int, int*, float*, float> x{&a, &f, b, &d, &c, std::move(f)};

    std::cout << typeid(x).name() << std::endl;

    return 0;
}

In order to reproduce the results, do the following (assuming you have g++-4.7 installed).

g++ -std=c++11 -Wall -Ofast -march=native variadic_reorder.cpp -S -o with_templates.s
// Comment out the line in main, and comment the line containing auto x, as well as the line below it.
g++ -std=c++11 -Wall -Ofast -march=native variadic_reorder.cpp -S -o without_templates.s
vimdiff with_templates.s without_templates.s

The only differences I noticed were things like the names of labels; otherwise, the generated machine code was identical.

Edit 1 I'm going to accept my own answer until someone else posts something more elegant than what I have.

OTHER TIPS

I don't have time to experiment with this, but if your compiler is generating too many moves when permuting forwarded arguments, I would recommend forwarding through std::forward_as_tuple. The tuple is a data structure with a concrete layout, and constructing it encourages the compiler to put things into memory all at once, in the order you want.

On the other hand, it is less likely to promote arguments to registers and bypass the stack for simple functions. And nothing is really guaranteed as long as the tuple is only used as a pure value (no reference is taken), because under the as-if rule there is no need for its members to have any addresses.

We can save stack space by grouping the pointers together and storing the larger expressions towards the end of the tuple.

Lvalue references are implemented as pointers by the ABI yet you specified to group them as data values. Rvalue references should be treated the same as lvalue references if you want to adhere to the native passing semantics. (I assume you will only be moving class types.) So the sorting problem is slightly more complicated than stated, because you want to sort the parameters into value and pointer categories, then sort those by base type.

As for the sorting algorithm itself, I would try just popping from the input pack and pushing to a set of output tuples, queue-style, and then catenating the output tuples with std::tuple_cat. That will be the simplest to implement, stable, and should hit the compiler's common-case optimizations. Don't implement an algorithm intended to run in-place in memory because TMP doesn't work like that.

As for translating the results of sorting into the function that permutes the parameters into the arguments to forward_as_tuple, I'm not so sure. You'll probably have to deal with indexes.

You want to be very sure the benefit is worth it before committing to all this.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top