C++ algorithms that create their output-storage instead of being applied to existing storage?

https://stackoverflow.com/questions/7203278

13-01-2021

Question

The C++ std algorithms define a number of algorithms that take an input and an output sequence, and create the elements of the output sequence from the elements of the input sequence. (Best example being std::transform.)

The std algorithms obviously take iterators, so there's no question that the container for the OutputIterator has to exist prior to the algorithm being invoked.

That is:

std::vector<int> v1; // e.g. v1 := {1, 2, 3, 4, 5};

std::vector<int> squared;
squared.reserve(v1.size()); // not strictly necessary
std::transform(v1.begin(), v1.end(), std::back_inserter(squared), 
               [](int x) { return x*x; } ); // λ for convenience, needn't be C++11

And this is fine as far as the std library goes. When I find iterators too cumbersome, I often look to Boost.Range to simplify things.

In this case however, it seems that the mutating algorithms in Boost.Range also use OutputIterators.

So I'm currently wondering whether there's any convenient library out there, that allows me to write:

std::vector<int> const squared = convenient::transform(v1, [](int x) { return x*x; });

-- and if there is none, whether there is a reason that there is none?

Edit: example implementation (not sure if this would work in all cases, and whether this is the most ideal one):

template<typename C, typename F>
C transform(C const& input, F fun) {
   C result;
   std::transform(input.begin(), input.end(), std::back_inserter(result), fun);
   return result;
}

(Note: I think convenient::transform will have the same performance characteristics as the handwritten version, since (N)RVO means the returned vector won't be copied. Anyway, I think performance is secondary for this question.)

Edit/Note: Of the answers (comments, really) given so far, David gives a very nice basic generic example.

And Luc mentions a possible problem with std::back_inserter wrt. genericity.

Both just go to show why I'm hesitating to whip this up myself and why a "proper" (properly tested) library would be preferable to coding this myself.

My actual question above, namely whether such a library exists or whether there is a reason that none does, remains largely unanswered.

Solution

The Boost.Range adaptors can be seen as a kind of container-returning algorithm. Why not use them?

The only thing that needs to be done is to define a new range adaptor, converted<T>, that can be piped onto the end of the adapted ranges and produces the desired result container:

template<class T> struct converted{}; // dummy tag class

template<class FwdRange, class T>
T operator|(FwdRange const& r, converted<T>){
  return T(r.begin(), r.end());
}

Yep, that's it. No need for anything else. Just pipe that at the end of your adaptor list.
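The tag-and-pipe idea does not depend on Boost itself. The following is a minimal sketch, assuming only the standard library and C++11; it uses std::begin/std::end instead of the member functions so that built-in arrays are accepted too. The namespace sketch and the function pipe_demo are illustrative names, not part of the answer above.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

namespace sketch {

// Empty tag type: names the container to build at the end of a pipeline.
template <class T> struct converted {};

// Copies any range that works with std::begin/std::end into a new T.
template <class Range, class T>
T operator|(Range const& r, converted<T>) {
    return T(std::begin(r), std::end(r));
}

// Usage: build a list from a vector and a vector from a built-in array.
inline void pipe_demo() {
    std::vector<int> const v{1, 2, 3};
    std::list<int> const l = v | converted<std::list<int>>();
    assert(l.size() == 3 && l.front() == 1 && l.back() == 3);

    int const raw[] = {4, 5};
    std::vector<int> const w = raw | converted<std::vector<int>>();
    assert(w.size() == 2 && w[1] == 5);
}

} // namespace sketch
```

Because converted<T> lives in the same namespace as the operator, argument-dependent lookup finds operator| even when the left-hand side is a std container or a plain array.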

Here could be a live example on Ideone. Alas, there isn't one, because Ideone doesn't provide Boost in C++0x mode. In any case, here's main and the output:

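#include <iostream>
#include <vector>
#include <boost/range/adaptors.hpp>
#include <boost/range/algorithm/for_each.hpp>
#include <boost/range/irange.hpp>

// converted<T> and its operator| from above must be declared before main.
int main(){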
  using namespace boost::adaptors;
  auto range = boost::irange(1, 10);
  std::vector<int> v1(range.begin(), range.end());

  auto squared = v1 | transformed([](int i){ return i * i; });
  boost::for_each(squared, [](int i){ std::cout << i << " "; });
  std::cout << "\n========================\n";
  auto modded = squared | reversed
                        | filtered([](int i){ return (i % 2) == 0; })
                        | converted<std::vector<int>>(); // gimme back my vec!
  modded.push_back(1);
  boost::for_each(modded, [](int i){ std::cout << i << " "; });
}

Output:

1 4 9 16 25 36 49 64 81
========================
64 36 16 4 1

OTHER TIPS

This is not meant as an answer to the question itself, it's a complement to the other answers -- but it wouldn't fit in the comments.

Quoting another answer: "well - what if you wanted list or deque or some other sequence type container - it's pretty limiting."

namespace detail {

template<typename Iter, typename Functor>
struct transform {
    Iter first, last;
    Functor functor;

    template<typename Container> // SFINAE is also available here
    operator Container()
    {
        Container c;
        std::transform(first, last, std::back_inserter(c), std::forward<Functor>(functor));
        return c;
    }
};

} // detail

template<typename Iter, typename Functor>
detail::transform<Iter, typename std::decay<Functor>::type>
transform(Iter first, Iter last, Functor&& functor)
{ return { first, last, std::forward<Functor>(functor) }; }

While this would work with a handful of containers, it's still not terribly generic since it requires that the container be 'compatible' with std::back_inserter(c) (BackInsertable?). Possibly you could use SFINAE to instead use std::inserter with c.begin() if c.push_back() is not available (left as an exercise to the reader).
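One way the exercise might be approached is with expression SFINAE on push_back. This is only a sketch under C++11 assumptions: make_inserter, transform_into and inserter_demo are hypothetical names, and the fallback inserts at c.end() rather than c.begin(), which is equivalent for a freshly constructed (empty) container.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <utility>
#include <vector>

namespace sketch {

// Preferred overload (the literal 0 matches int exactly): only viable when
// c.push_back(value) is a well-formed expression.
template <typename Container>
auto make_inserter(Container& c, int)
    -> decltype(c.push_back(std::declval<typename Container::value_type>()),
                std::back_inserter(c))
{
    return std::back_inserter(c);
}

// Fallback overload (0 converts to long): uses the general insert() interface.
template <typename Container>
std::insert_iterator<Container> make_inserter(Container& c, long)
{
    return std::inserter(c, c.end());
}

// Builds a Container from the transformed range, picking the inserter above.
template <typename Container, typename Iter, typename Functor>
Container transform_into(Iter first, Iter last, Functor functor)
{
    Container c;
    std::transform(first, last, make_inserter(c, 0), functor);
    return c;
}

inline void inserter_demo() {
    std::vector<int> const in{3, 1, 2, 1};
    auto square = [](int x) { return x * x; };

    // std::vector has push_back, so back_inserter is used.
    std::vector<int> const v =
        transform_into<std::vector<int>>(in.begin(), in.end(), square);
    assert((v == std::vector<int>{9, 1, 4, 1}));

    // std::set has no push_back, so the std::inserter fallback is used.
    std::set<int> const s =
        transform_into<std::set<int>>(in.begin(), in.end(), square);
    assert((s == std::set<int>{1, 4, 9}));
}

} // namespace sketch
```

The container still has to be default-constructible, so the caveat about allocators is untouched by this.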

All of this also assumes that the container is DefaultConstructible -- consider containers that make use of scoped allocators. Presumably that loss of genericity is a feature, as we're only trying to cover the 'simplest' uses.

And this is in fact why I would not use such a library: I don't mind creating the container just outside, next to the algorithm, to separate the concerns. (I suppose this can be considered my answer to the question.)

IMHO, the point of such an algorithm is to be generic, i.e. mostly container agnostic. What you are proposing is that the transform function be very specific and return a std::vector. Well, what if you wanted a list or deque or some other sequence container? It's pretty limiting.

Why not wrap if you find it so annoying? Create your own little utilities header which does this - after all, it's pretty trivial...

There is no single correct way of enabling

std::vector<int> const squared = 
             convenient::transform(v1, [](int x) { return x*x; });

without a potential performance cost. One option is to name the container type explicitly:

std::vector<int> const squared = 
             convenient::transform<std::vector> (v1, [](int x) { return x*x; });

Note the explicit mention of the container type: iterators don't tell you anything about the container they belong to. This becomes obvious if you recall that the standard allows a container's iterator to be an ordinary pointer.
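For illustration, the explicit form convenient::transform<std::vector>(...) could be spelled with a template template parameter. This is a sketch assuming C++11; deducing the element type from what the function object returns is my own choice, not something stated above, and explicit_container_demo is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <iterator>
#include <type_traits>
#include <vector>

namespace convenient {

// Out is a container template such as std::vector or std::deque; the element
// type is whatever fun returns for an element of the input.
template <template <typename...> class Out, typename In, typename F>
auto transform(In const& input, F fun)
    -> Out<typename std::decay<decltype(fun(*std::begin(input)))>::type>
{
    typedef typename std::decay<decltype(fun(*std::begin(input)))>::type Elem;
    Out<Elem> result;
    std::transform(std::begin(input), std::end(input),
                   std::back_inserter(result), fun);
    return result;
}

inline void explicit_container_demo() {
    std::vector<int> const v1{1, 2, 3};

    std::vector<int> const squared =
        convenient::transform<std::vector>(v1, [](int x) { return x * x; });
    assert((squared == std::vector<int>{1, 4, 9}));

    // A different target container and a different element type.
    std::deque<double> const halves =
        convenient::transform<std::deque>(v1, [](int x) { return x / 2.0; });
    assert(halves.size() == 3 && halves[0] == 0.5);
}

} // namespace convenient
```

The result container always uses its default allocator here, so this covers only the simple cases.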

Letting the algorithm take a container instead of iterators is not a solution either: the algorithm then can't know how to get the first and last element correctly. For example, a plain array of long int has no begin(), end(), or length() member functions, not all containers have random-access iterators, and not all define operator[]. So there is no truly generic way to take containers.
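As a side note, part of this objection is softened in C++11, where the free functions std::begin and std::end also accept built-in arrays. The sketch below assumes C++11 and uses them to take a plain long array and a std::list through the same interface, without operator[] or random access; doubled and range_demo are made-up names for the illustration.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

namespace sketch {

// Accepts anything std::begin/std::end understand; needs neither operator[]
// nor random-access iterators, only forward traversal.
template <typename Range>
std::vector<long> doubled(Range const& input) {
    std::vector<long> result;
    for (auto it = std::begin(input); it != std::end(input); ++it) {
        result.push_back(2L * *it);
    }
    return result;
}

inline void range_demo() {
    long const raw[] = {1L, 2L, 3L};   // no member begin()/end()
    std::list<int> const lst{4, 5};    // no operator[], not random access

    assert((doubled(raw) == std::vector<long>{2L, 4L, 6L}));
    assert((doubled(lst) == std::vector<long>{8L, 10L}));
}

} // namespace sketch
```

This does not answer which container type to return, which is the harder half of the problem discussed here.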

Another possibility that allows for container-agnostic, container-returning algorithms would be some kind of generic factory (see live at http://ideone.com/7d4E2):

// (not production code; is even lacking allocator-types)
//-- Generic factory. -------------------------------------------
#include <list>
template <typename ElemT, typename CacheT=std::list<ElemT> >
struct ContCreator {

    CacheT cache; // <-- Temporary storage.

    // Conversion to target container type.
    template <typename ContT>
    operator ContT () const {
        // can't even move ...
        return ContT (cache.begin(), cache.end());
    }
};

Not so much magic there apart from the templated cast operator. You then return that thing from your algorithm:

//-- A generic algorithm, like std::transform :) ----------------
ContCreator<int> some_ints () {
    ContCreator<int> cc;
    for (int i=0; i<16; ++i) {
        cc.cache.push_back (i*4);
    }
    return cc;
}

And finally use it like this to write magic code:

//-- Example. ---------------------------------------------------
#include <vector>
#include <iostream>
int main () {
    typedef std::vector<int>::iterator Iter;
    std::vector<int> vec = some_ints();
    for (Iter it=vec.begin(), end=vec.end(); it!=end; ++it) {
        std::cout << *it << '\n';
    }
}

As you see, in operator ContT there's a range copy.

A move might be possible by means of template specialization in case the target and source containers are of the same type.
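Instead of a template specialization, one possible way to get that move is an extra non-template conversion operator for the cache type itself. The following is a sketch assuming C++11 ref-qualifiers, so that only an rvalue creator gives up its cache; move_demo is a hypothetical helper, and the address check relies on std::list keeping its nodes when moved.

```cpp
#include <cassert>
#include <list>
#include <utility>
#include <vector>

namespace sketch {

template <typename ElemT, typename CacheT = std::list<ElemT>>
struct ContCreator {
    CacheT cache; // temporary storage, as in the answer

    // Same-type case: an rvalue creator hands over its cache without copying.
    operator CacheT() && { return std::move(cache); }

    // General case: range copy into any other container type.
    template <typename ContT>
    operator ContT() const { return ContT(cache.begin(), cache.end()); }
};

inline ContCreator<int> some_ints() {
    ContCreator<int> cc;
    for (int i = 0; i < 4; ++i) {
        cc.cache.push_back(i * 4);
    }
    return cc;
}

inline void move_demo() {
    ContCreator<int> cc = some_ints();
    int const* first = &cc.cache.front();

    std::list<int> l = std::move(cc);             // picks the moving overload
    assert(l.size() == 4 && &l.front() == first); // list nodes were not copied

    std::vector<int> v = some_ints();             // different type: range copy
    assert(v.size() == 4 && v[3] == 12);
}

} // namespace sketch
```

As with the original, the conversions only work in copy-initialization such as `std::vector<int> v = some_ints();`.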

Edit: As David points out, you can of course do the real work inside the conversion operator, which probably comes at no extra cost (with some more work it can be made more convenient; this is just for demonstration):

#include <list>
template <typename ElemT, typename Iterator>
struct Impl {
    Impl(Iterator it, Iterator end) : it(it), end(end) {}

    Iterator it, end;

    // "Conversion" + Work.
    template <typename ContT>
    operator ContT () {
        ContT ret;
        for ( ; it != end; ++it) {
            ret.push_back (*it * 4);
        }
        return ret;    
    }
};

template <typename Iterator>
Impl<int,Iterator> foo (Iterator begin, Iterator end) {
    return Impl<int,Iterator>(begin, end);
}

#include <vector>
#include <iostream>
int main () {
    typedef std::vector<int>::iterator Iter;

    const int ints [] = {1,2,4,8};
    std::vector<int> vec = foo (ints, ints + sizeof(ints) / sizeof(int));

    for (Iter it=vec.begin(), end=vec.end(); it!=end; ++it) {
        std::cout << *it << '\n';
    }
}

The one requirement is that the target has a push_back method. Using std::distance to reserve a size may lead to sub-optimal performance if the source iterators are not random-access, because the range is then traversed an extra time.
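The reservation could be made conditional, so that it only happens when it is cheap and possible. A sketch assuming C++11: it dispatches on the iterator category and uses SFINAE to detect reserve(). The names try_reserve, transform_reserved and reserve_demo are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

namespace sketch {

// Chosen only for random-access iterators and containers with reserve():
// std::distance is O(1) there, so reserving costs no extra traversal.
template <typename Container, typename Iter>
auto try_reserve(Container& c, Iter first, Iter last,
                 std::random_access_iterator_tag)
    -> decltype(c.reserve(typename Container::size_type()), void())
{
    c.reserve(static_cast<typename Container::size_type>(
        std::distance(first, last)));
}

// Fallback for every other combination: skip the reservation.
template <typename Container, typename Iter>
void try_reserve(Container&, Iter, Iter, std::input_iterator_tag) {}

template <typename Container, typename Iter, typename Functor>
Container transform_reserved(Iter first, Iter last, Functor functor)
{
    Container out;
    try_reserve(out, first, last,
                typename std::iterator_traits<Iter>::iterator_category());
    std::transform(first, last, std::back_inserter(out), functor);
    return out;
}

inline void reserve_demo() {
    int const ints[] = {1, 2, 4, 8};
    auto times4 = [](int x) { return x * 4; };

    // Pointers are random-access and vector has reserve(): space is reserved.
    std::vector<int> const v =
        transform_reserved<std::vector<int>>(ints, ints + 4, times4);
    assert((v == std::vector<int>{4, 8, 16, 32}));

    // std::list has no reserve(): the fallback overload is selected.
    std::list<int> const l =
        transform_reserved<std::list<int>>(v.begin(), v.end(), times4);
    assert(l.size() == 4 && l.back() == 128);
}

} // namespace sketch
```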

Again, not an answer, but rather a follow-up from the comments on another answer.

On the genericity of the returned type in the question's code

The code as it stands does not allow the return type to be converted, but that is easily solved by providing two templates:

template <typename R, typename C, typename F>
R transform( C const & c, F f ) {
    R res;
    std::transform( c.begin(), c.end(), std::back_inserter(res), f );
    return res;
}
template <typename C, typename F>
C transform( C const & c, F f ) {
    return transform<C,C,F>(c,f);
}
std::vector<int> src;
std::vector<int> v = transform( src, functor );
std::deque<int>  d = transform<std::deque<int> >( src, functor );
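A variation on the same idea, sketched here under C++11 assumptions, folds both templates into one by giving the result type a default template argument. With two separate overloads, explicitly requesting the same type as the input (for example transform<std::vector<int> >(src, functor) with a std::vector<int> source) makes both overloads viable; a single template sidesteps that. The namespace sketch and default_result_demo are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <iterator>
#include <type_traits>
#include <vector>

namespace sketch {

// R defaults to void, meaning "return the same container type as the input".
template <typename R = void, typename C, typename F>
typename std::conditional<std::is_void<R>::value, C, R>::type
transform(C const& c, F f)
{
    typedef typename std::conditional<std::is_void<R>::value, C, R>::type Result;
    Result res;
    std::transform(c.begin(), c.end(), std::back_inserter(res), f);
    return res;
}

inline void default_result_demo() {
    std::vector<int> const src{1, 2, 3};
    auto twice = [](int x) { return 2 * x; };

    std::vector<int> const v = sketch::transform(src, twice);
    std::deque<int> const d = sketch::transform<std::deque<int>>(src, twice);
    std::vector<int> const w = sketch::transform<std::vector<int>>(src, twice);

    assert((v == std::vector<int>{2, 4, 6}));
    assert((d == std::deque<int>{2, 4, 6}));
    assert(w == v);
}

} // namespace sketch
```

The result type still has to be default-constructible and compatible with std::back_inserter, as in the two-template version.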

Licensed under: CC-BY-SA with attribution
