Why is there no transform_if in the C++ standard library?

Question 1

The standard library favours elementary algorithms.

Containers and algorithms should be independent of each other if possible.

Likewise, algorithms that can be composed of existing algorithms are only rarely included, as shorthand.

If you need a transform_if, you can trivially write one. If you want it /today/, composed from ready-made parts and without incurring overhead, you can use a range library that has lazy ranges, such as Boost.Range, e.g.:

v | filtered(arg1 % 2) | transformed(arg1 * arg1 / 7.0)

As @hvd points out in a comment, transform_if would result in a different element type (double, in this case). Composition order matters, and with Boost.Range you could also write:

 v | transformed(arg1 * arg1 / 7.0) | filtered(arg1 < 2.0)

resulting in different semantics. This drives home the point:

it makes very little sense to include std::filter_and_transform, std::transform_and_filter, std::filter_transform_and_filter etc. etc. into the standard library.

See a sample Live On Coliru

#include <boost/range/algorithm.hpp>
#include <boost/range/adaptors.hpp>

using namespace boost::adaptors;

// only for succinct predicates without lambdas
#include <boost/phoenix.hpp>
using namespace boost::phoenix::arg_names;

// for demo
#include <iostream>
#include <iterator> // std::ostream_iterator
#include <vector>

int main()
{
    std::vector<int> const v { 1,2,3,4,5 };

    boost::copy(
            v | filtered(arg1 % 2) | transformed(arg1 * arg1 / 7.0),
            std::ostream_iterator<double>(std::cout, "\n"));
}
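
The remark above that a transform_if is trivial to write can be sketched roughly as follows. This is only an illustration: the name transform_if, its parameter order, and the helper transform_if_example are assumptions made here, not anything the standard library or Boost provides.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper (not a standard algorithm): writes op(x) to out
// for every x in [first, last) for which pred(x) holds.
template <class InputIt, class OutputIt, class Pred, class UnaryOp>
OutputIt transform_if(InputIt first, InputIt last, OutputIt out,
                      Pred pred, UnaryOp op)
{
    for (; first != last; ++first) {
        if (pred(*first)) {
            *out = op(*first);
            ++out;
        }
    }
    return out;
}

// Usage check: keep the odd values of {1,2,3,4,5}, map each to x * x / 7.0.
inline void transform_if_example()
{
    std::vector<int> const v { 1, 2, 3, 4, 5 };
    std::vector<double> out;
    transform_if(v.begin(), v.end(), std::back_inserter(out),
                 [](int x) { return x % 2 != 0; },
                 [](int x) { return x * x / 7.0; });
    assert(out.size() == 3);                // 1, 3 and 5 pass the filter
    assert(out[1] > 1.28 && out[1] < 1.29); // 9 / 7.0
}
```

Note that the predicate here is applied before the transform; a version that filters on the transformed value would be a different algorithm.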

Question 2

The range-based for loop in many ways reduces the need for algorithms that visit every element of a collection: it is now often cleaner to just write a loop and put the logic in place.

std::vector< decltype( op( *begin(coll) ) ) > output;
for( auto const& elem : coll )
{
   if( pred( elem ) )
   {
        output.push_back( op( elem ) );
   }
}

Does it really provide a lot of value now to put this in an algorithm? Yes, the algorithm would have been useful in C++03, and indeed I had one for it, but we don't need one now, so there is no real advantage in adding it.

Note that in practical use your code won't always look exactly like that either: you don't necessarily have functions "op" and "pred" and may have to create lambdas to make them "fit" into algorithms. Whilst it is nice to separate out concerns if the logic is complex, if it is just a matter of extracting a member from the input type and checking its value or adding it to the collection, it's a lot simpler once again than using an algorithm.

In addition, once you are adding some kind of transform_if, you have to decide whether to apply the predicate before or after the transform, or even have 2 predicates and apply it in both places.

So what are we going to do? Add 3 algorithms? (And in cases where the predicate compiles on either side of the transform, a user could easily pick the wrong algorithm by mistake; the code would still compile but produce wrong results.)
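
To make the ordering problem concrete, here is a small sketch. The two helper names, filter_then_transform and transform_then_filter, are hypothetical and exist only for this illustration. The same predicate compiles in both positions because input and output are both int, yet the results differ.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical helper: the predicate sees the ORIGINAL element.
template <class T, class Pred, class Op>
auto filter_then_transform(std::vector<T> const& in, Pred pred, Op op)
{
    std::vector<std::decay_t<decltype(op(in.front()))>> out;
    for (auto const& x : in)
        if (pred(x))
            out.push_back(op(x));
    return out;
}

// Hypothetical helper: the predicate sees the TRANSFORMED value.
template <class T, class Op, class Pred>
auto transform_then_filter(std::vector<T> const& in, Op op, Pred pred)
{
    std::vector<std::decay_t<decltype(op(in.front()))>> out;
    for (auto const& x : in) {
        auto y = op(x);
        if (pred(y))
            out.push_back(y);
    }
    return out;
}

inline void ordering_example()
{
    std::vector<int> const v { 1, 2, 3, 4, 5 };
    auto square  = [](int x) { return x * x; };
    auto below10 = [](int x) { return x < 10; };

    // Predicate on inputs: all of 1..5 are below 10, so every square is kept.
    auto a = filter_then_transform(v, below10, square);
    // Predicate on outputs: only the squares 1, 4 and 9 are below 10.
    auto b = transform_then_filter(v, square, below10);

    assert((a == std::vector<int>{ 1, 4, 9, 16, 25 }));
    assert((b == std::vector<int>{ 1, 4, 9 }));
}
```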

Also, if the collections are large, does the user want to loop with iterators or map/reduce? With the introduction of map/reduce you get even more complexities in the equation.

Essentially, the library provides the tools, and the user is left here to use them to fit what they want to do, not the other way round as was often the case with algorithms. (See how the user above tried to twist things using accumulate to fit what they really wanted to do).

For a simple example, a map. For each element I will output the value if the key is even.

std::vector< std::string > valuesOfEvenKeys
    ( std::map< int, std::string > const& keyValues )
{
    std::vector< std::string > res;
    for( auto const& elem: keyValues )
    {
        if( elem.first % 2 == 0 )
        {
            res.push_back( elem.second );
        }
    }
    return res;
}

Nice and simple. Fancy fitting that into a transform_if algorithm?

Question 3

Sorry to resurrect this question after so long. I had a similar requirement recently. I solved it by writing a version of back_insert_iterator that takes a boost::optional:

template<class Container>
struct optional_back_insert_iterator
: public std::iterator< std::output_iterator_tag,
void, void, void, void >
{
    explicit optional_back_insert_iterator( Container& c )
    : container(std::addressof(c))
    {}

    using value_type = typename Container::value_type;

    optional_back_insert_iterator<Container>&
    operator=( boost::optional<value_type> opt ) // by non-const value, so the move below really moves
    {
        if (opt) {
            container->push_back(std::move(opt.value()));
        }
        return *this;
    }

    optional_back_insert_iterator<Container>&
    operator*() {
        return *this;
    }

    optional_back_insert_iterator<Container>&
    operator++() {
        return *this;
    }

    optional_back_insert_iterator<Container>&
    operator++(int) {
        return *this;
    }

protected:
    Container* container;
};

template<class Container>
optional_back_insert_iterator<Container> optional_back_inserter(Container& container)
{
    return optional_back_insert_iterator<Container>(container);
}

used like this:

transform(begin(s), end(s),
          optional_back_inserter(d),
          [](const auto& s) -> boost::optional<size_t> {
              if (s.length() > 1)
                  return { s.length() * 2 };
              else
                  return { boost::none };
          });

Question 4

After just finding this question again after some time, and devising a whole slew of potentially useful generic iterator adaptors I realized that the original question required NOTHING more than std::reference_wrapper.

Use it instead of a pointer, and you're good:

Live On Coliru

#include <algorithm>
#include <functional> // std::reference_wrapper
#include <iostream>
#include <vector>

struct ha {
    int i;
};

int main() {
    std::vector<ha> v { {1}, {7}, {1}, };

    std::vector<std::reference_wrapper<ha const> > ph; // target vector
    copy_if(v.begin(), v.end(), back_inserter(ph), [](const ha &parg) { return parg.i < 2; });

    for (ha const& el : ph)
        std::cout << el.i << " ";
}

Prints

1 1

Question 5

C++20 brought ranges and with them a new set of algorithms to operate on them. One of the most powerful tools in this addition is that of views:

They support lazy evaluation, which means elements are generated upon request and not upon construction. So performance considerations are put to rest (the original question mentions how creating temporary vectors with intermediate results is sub-optimal).
They are composable, which means that operations can easily be chained together without loss of performance or expressiveness.

Armed with those new tools, a transform if operation to:

"transform a vector v using function A
only if an element satisfies condition B"

becomes as simple as:

v | std::views::filter(B) | std::views::transform(A)

It's now fair to say that there is a pretty straight-forward way to do "transform if" using the Standard library.

What was originally asked can be written as:

#include <iostream>
#include <memory>   // std::addressof
#include <ranges>
#include <vector>

struct ha { 
    int i;
    explicit ha(int a) : i(a) {}
};

int main() 
{
    std::vector<ha> v{ ha{1}, ha{7}, ha{1}, ha{4}, ha{3}, ha{0} };

    auto less4 =  [](ha const& h) { return h.i < 4; };
    auto pnter =  [](ha const& h) { return std::addressof(h); };
 
    for (auto vp : v | std::views::filter(less4) 
                     | std::views::transform(pnter)) 
    {
        std::cout << vp->i << ' ';
    }    
}

Demo

Question 6

The standard is designed in such a way as to minimise duplication.

In this particular case you can achieve the algorithm's aims in a more readable and succinct way with a simple range-for loop.

// another way

vector<ha*> newVec;
for(auto& item : v) {
    if (item.i < 2) {
        newVec.push_back(&item);
    }
}

I have modified the example so that it compiles, added some diagnostics and presented both the OP's algorithm and mine side by side.

#include <vector>
#include <algorithm>
#include <iostream>
#include <iterator>

using namespace std;

struct ha { 
    explicit ha(int a) : i(a) {}
    int i;   // added this to solve compile error
};

// added diagnostic helpers
ostream& operator<<(ostream& os, const ha& t) {
    os << "{ " << t.i << " }";
    return os;
}

ostream& operator<<(ostream& os, const ha* t) {
    os << "&" << *t;
    return os;
}

int main() 
{
    vector<ha> v{ ha{1}, ha{7}, ha{1} }; // initial vector
    // GOAL : make a vector of pointers to elements with i < 2
    vector<ha*> ph; // target vector
    vector<ha*> pv; // temporary vector
    // 1. 
    transform(v.begin(), v.end(), back_inserter(pv), 
        [](ha &arg) { return &arg; }); 
    // 2. 
    copy_if(pv.begin(), pv.end(), back_inserter(ph),
        [](ha *parg) { return parg->i < 2;  });

    // output diagnostics
    copy(begin(v), end(v), ostream_iterator<ha>(cout));
    cout << endl;
    copy(begin(ph), end(ph), ostream_iterator<ha*>(cout));
    cout << endl;


    // another way

    vector<ha*> newVec;
    for(auto& item : v) {
        if (item.i < 2) {
            newVec.push_back(&item);
        }
    }

    // diagnostics
    copy(begin(newVec), end(newVec), ostream_iterator<ha*>(cout));
    cout << endl;
    return 0;
}

Question 7

You may use copy_if alone. Why not? Define an OutputIt (see copy):

struct my_inserter: back_insert_iterator<vector<ha *>>
{
  my_inserter(vector<ha *> &dst)
    : back_insert_iterator<vector<ha *>>(back_inserter<vector<ha *>>(dst))
  {
  }
  my_inserter &operator *()
  {
    return *this;
  }
  my_inserter &operator =(ha &arg)
  {
    *static_cast< back_insert_iterator<vector<ha *>> &>(*this) = &arg;
    return *this;
  }
};

and rewrite your code:

int main() 
{
    vector<ha> v{ ha{1}, ha{7}, ha{1} }; // initial vector
    // GOAL : make a vector of pointers to elements with i < 2
    vector<ha*> ph; // target vector

    my_inserter yes(ph);
    copy_if(v.begin(), v.end(), yes,
        [](const ha &parg) { return parg.i < 2;  });

    return 0;
}

Question 8

template <class InputIt, class OutputIt, class BinaryOp>
OutputIt
transform_if(InputIt it, InputIt end, OutputIt oit, BinaryOp op)
{
    for(; it != end; ++it, (void) ++oit)
        op(oit, *it);
    return oit;
}

Usage: (Note that CONDITION and TRANSFORM are not macros, they are placeholders for whatever condition and transformation you want to apply)

std::vector<int> a{1, 2, 3, 4};
std::vector<int> b(a.size());           // Must be pre-sized: oit advances once per input element

transform_if(a.begin(), a.end(), b.begin(),
    [](auto oit, auto item)             // Note the use of 'auto' to make life easier
    {
        if(CONDITION(item))             // Here's the 'if' part
            *oit = TRANSFORM(item);     // Here's the 'transform' part
    }                                   // Elements that fail CONDITION leave their slot in b untouched
);

Question 9

This is just an answer to question 1 "Is there a more elegant workaround with the available C++ standard library tools ?".

If you can use C++17, then you can use std::optional for a simpler solution that relies only on C++ standard library functionality. The idea is to return std::nullopt when there is no mapping:

See live on Coliru

#include <iostream>
#include <iterator> // std::back_inserter
#include <optional>
#include <vector>

template <
    class InputIterator, class OutputIterator, 
    class UnaryOperator
>
OutputIterator filter_transform(InputIterator first1, InputIterator last1,
                            OutputIterator result, UnaryOperator op)
{
    while (first1 != last1) 
    {
        if (auto mapped = op(*first1)) {
            *result = std::move(mapped.value());
            ++result;
        }
        ++first1;
    }
    return result;
}

struct ha { 
    int i;
    explicit ha(int a) : i(a) {}
};

int main()
{
    std::vector<ha> v{ ha{1}, ha{7}, ha{1} }; // initial vector

    // GOAL : make a vector of pointers to elements with i < 2
    std::vector<ha*> ph; // target vector
    filter_transform(v.begin(), v.end(), back_inserter(ph), 
        [](ha &arg) { return arg.i < 2 ? std::make_optional(&arg) : std::nullopt; });

    for (auto p : ph)
        std::cout << p->i << std::endl;

    return 0;
}

Note that I just implemented Rust's approach in C++ here.

Question 10

You can use std::accumulate which operates on a pointer to the destination container:

Live On Coliru

#include <numeric>
#include <iostream>
#include <vector>

struct ha
{
    int i;
};

// filter and transform is here
std::vector<int> * fx(std::vector<int> *a, struct ha const & v)
{
    if (v.i < 2)
    {
        a->push_back(v.i);
    }

    return a;
}

int main()
{
    std::vector<ha> v { {1}, {7}, {1}, };

    std::vector<int> ph; // target vector

    std::accumulate(v.begin(), v.end(), &ph, fx);
    
    for (int el : ph)
    {
        std::cout << el << " ";
    }
}

Prints

1 1

Question 11

The standard std::copy_if & std::transform support execution policies (e.g., std::execution::par_unseq), so a standard std::copy_if_and_transform would also do so and allow one to filter & transform in parallel, without having to create an intermediate sequence of elements (copy_if) and then transform that.

None of the "do it yourself" suggestions above seem to be able to do so.
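
For reference, the two-step composition that the paragraph above wants to avoid looks roughly like this. The sketch is deliberately sequential so that it has no extra build requirements. The policy-taking overloads of std::copy_if and std::transform take the policy as an additional first argument and require forward iterators, so a back_inserter would not qualify there. The names squares_of_odds_two_pass and two_pass_example are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: squares of the odd inputs, rendered as strings.
inline std::vector<std::string> squares_of_odds_two_pass(std::vector<int> const& in)
{
    // Step 1: copy_if materialises the intermediate sequence.
    std::vector<int> kept;
    std::copy_if(in.begin(), in.end(), std::back_inserter(kept),
                 [](int x) { return x % 2 != 0; });

    // Step 2: transform reads that intermediate sequence.
    std::vector<std::string> out(kept.size());
    std::transform(kept.begin(), kept.end(), out.begin(),
                   [](int x) { return std::to_string(x * x); });
    return out;
}

inline void two_pass_example()
{
    auto r = squares_of_odds_two_pass({ 1, 2, 3, 4 });
    assert((r == std::vector<std::string>{ "1", "9" }));
}
```

The intermediate vector kept is exactly the cost a fused copy_if_and_transform would save.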

So I too wonder why the standard didn't include a copy_if_and_transform algorithm. Nikos' answer above (https://stackoverflow.com/a/70523558/20396957) (which I like a lot, as it introduced me to ranges!) uses ranges to do this lazily. But "lazily" doesn't necessarily guarantee an execution policy - they could all be computed sequentially for all I know.

So, do we still need the copy_if_and_transform?

And is the newer standard (C++23) going to provide it?

(and same question for remove_if_and_transform while I'm at it, since one may want to do the filter/transform in place instead of constructing a new container)

EDIT: Here's code I've written to implement the (policy taking) copy_if_and_transform using the standard copy_if - hope it helps! I'd love to hear comments about it and how one can improve it (my generic programming skills are not very good).

Solution - What's the idea:

copy_if uses *first1 twice: once to call pred() on it, and a second time to assign it to *d_first. I want to hijack the second use so that it calls the transform operation. So I proxy the input iterator so that dereferencing it returns a proxy_val instead. Then I wrap pred so that it accepts a proxy_val and applies itself to the underlying value. proxy_val also provides a conversion to the output iterator's element type, and that conversion is where the transform operation is called.


#include <iostream>
#include <string>
#include <vector>
#include <functional>
#include <algorithm>
#include <execution>
#include <iterator>
#include <utility>

// Get the container element type from an iterator
template<class It, class Itvaluetype>
struct get_value_type {
    using value_type = std::iter_value_t<It>;
};

// Get the container element type from an inserter
template<class It> 
struct get_value_type<It, void> {
    using value_type = typename It::container_type::value_type ;
};

template< class ExecutionPolicy, class InputIt, class OutputIt,
        class UnaryPredicate, class UnaryOperation>
OutputIt copy_if_and_transform(ExecutionPolicy&& policy,
            InputIt first1, InputIt last1,
                    OutputIt d_first,
            UnaryPredicate pred,
            UnaryOperation unary_op) {
    if (first1 != last1) {

        using InputElementType
            = std::iterator_traits<InputIt>::value_type;
        using OutputElementType
            = get_value_type< OutputIt, typename std::iterator_traits< OutputIt > ::value_type >::value_type ;

    class proxy_val {
            UnaryOperation op;
        public:
            InputElementType val;
            proxy_val(const InputElementType &vl
                , UnaryOperation o
                ) : op(o) , val(vl) {}
            operator OutputElementType() const {return op(val);}
    };

    class proxy_InputIt {
        InputIt ii;
        UnaryOperation op;
        public:
        proxy_InputIt(const InputIt &an_in_it, UnaryOperation o)
            : ii(an_in_it) , op(o) {}
        proxy_InputIt &operator++() { ++ii; return *this; }
        proxy_InputIt operator++(int) { proxy_InputIt prev=*this; ++ii; return prev; }
        proxy_val operator*() {return {*ii, op};}
        bool operator==(const proxy_InputIt &o) const {return ii == o.ii;}
    };
    auto pr = [ &pred ]( const proxy_val &p ) {return pred(p.val);};
    d_first =
    std::copy_if(policy
            , proxy_InputIt(first1, unary_op)
            , proxy_InputIt(last1, unary_op)
            , d_first
            , pr
            );
    }
    return d_first;
}

// Test with iterator & inserter
int main() {
    std::vector<int> vi = {1, 2, 3, 4};
    std::vector<std::string> squares_of_odds(vi.size());

    auto e = 
        copy_if_and_transform(std::execution::par_unseq, 
            std::begin(vi), std::end(vi)
            , std::begin(squares_of_odds)
            , [](auto x) {return x%2;}
            , [](auto x) {return '|'+std::to_string(x*x)+'|';});

    std::cout << "Squares of odd\n";
    for ( auto f = begin(squares_of_odds); f != e; ++f )
        std::cout << (*f) << std::endl;

    std::vector<int> vib = {1, 2, 3, 4};
    std::vector<std::string> squares_of_even;

    copy_if_and_transform(std::execution::par_unseq, 
        std::begin(vib), std::end(vib)
        , std::back_inserter(squares_of_even)
        , [](auto x) {return 0==(x%2);}
        , [](auto x) {return '|' + std::to_string(x*x) + '|';} );

    std::cout << "Squares of even\n";
    for ( auto n : squares_of_even)
        std::cout << n << std::endl;

    return 0;
}