Question

I have an expensive for loop that is taking more time than it should due to dynamic casting overhead inside the loop.

An example code would be the following (compilable)

#include <iostream>
#include <memory>

struct Base {
  virtual ~Base() {}
};

struct DerivedA : Base {};
struct DerivedB : Base {};

struct Calculator {
  virtual void proceed(const DerivedA& ) const  {
    std::cout << "doing A stuff" << std::endl;
  }
  virtual void proceed(const DerivedB& ) const  {
    std::cout << "doing B stuff" << std::endl;
  }
};

void doStuff(const std::shared_ptr<Base> &base_ptr) {
  Calculator calc;
  // Code that does stuff using only Base properties
  for(int i = 0; i < 1000000; i++) { // expensive loop
    // "redundant" dynamic cast at every iteration
    auto a_ptr = std::dynamic_pointer_cast<DerivedA>(base_ptr);
    if(a_ptr) calc.proceed(*a_ptr);
    auto b_ptr = std::dynamic_pointer_cast<DerivedB>(base_ptr);
    if(b_ptr) calc.proceed(*b_ptr);
  }
}

int main() {
  std::shared_ptr<Base> base_ptr = std::make_shared<DerivedA>();
  doStuff(base_ptr);
}

Since the class is not changing inside the function, I think that there has to be a way to avoid the polymorhpism overhead (and branching) inside the loop and perform a single dynamic cast and a single function call without having to write the loop multiple times.

What I considered:

  1. Cast inside the proceed call.
  2. Visitor pattern.

I don't think that any of them solves the problem. Those are just different ways of doing the same thing.

I'm considering to rethink my design, but before that I would be happy to hear any ideas and suggestions that you may have to improve this piece of code.

Was it helpful?

Solution

First of all I would rather do as Baldrick suggests in a comment to the OP then I would try the other alternatives cited in the OP. I would everytime profiling/mesuring the results to make an informed decision.

If you are not yet satisfied, then I suggest something along these lines:

template <typename T>
void doStuffImpl(const T &obj) {
    Calculator calc;
    for(int i = 0; i < 1000000; i++)
        calc.proceed(obj);
}

void doStuff(const std::shared_ptr<Base> &base_ptr) {

    auto a_ptr = std::dynamic_pointer_cast<DerivedA>(base_ptr);
    if (a_ptr)
        doStuffImpl(*a_ptr);

    auto b_ptr = std::dynamic_pointer_cast<DerivedB>(base_ptr);
    if (b_ptr)
        doStuffImpl(*b_ptr);
}

OTHER TIPS

Sometimes it helps to optimize if you know your data.

If you know that your data contains mostly type A, then make sure this will be the first statement in your if list of your iteration. I am not sure though, if this is possible with dynamic objects.

Maybe you can exclude some calls to be iterated by using if-then-else statement inside the iteration? Is it really necessary to do always both dynamic pointer casts (a_ptr, b_ptr), or can there be exceptions reducing the workload?

Does this help you?

You could enumerate your types of derived class and so avoid the double-dispatch - implement some GetClass() as a pure virtual on Base

void doStuff(const std::shared_ptr<Base> &base_ptr) {
Calculator calc;
// Code that does stuff using only Base properties
for(int i = 0; i < 1000000; i++) { // expensive loop

  switch(base_ptr->GetClass())
  {
    case TYPE_A:
    // "redundant" dynamic cast at every iteration
    auto a_ptr = std::dynamic_pointer_cast<DerivedA>(base_ptr);
    if(a_ptr) 
      calc.proceed(*a_ptr);
    break;

    case TYPE_B:
    auto b_ptr = std::dynamic_pointer_cast<DerivedB>(base_ptr);
    if(b_ptr)
      calc.proceed(*b_ptr);
    break;
  }
}

int main() {
  std::shared_ptr<Base> base_ptr = std::make_shared<DerivedA>();
  doStuff(base_ptr);
}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top