I'm writing an ODE integrator that evaluates dy/dx at each step and doesn't need the result afterwards. It seems it would be faster to allocate the space only once and reuse it, so that I don't spend time allocating a new dydx vector at every step. Does the compiler optimize this? In other words, which one is better?

1)

vector<double> dydx(const vector<double>&x) {
  vector<double> dydx_tmp(x.size());
  for(size_t i = 0; i < x.size()/2; ++i) {
    dydx_tmp[2*i] = -x[2*i+1]; 
    dydx_tmp[2*i+1] = x[2*i];
  } 
  return dydx_tmp;
}

or 2), where dydx is already allocated and just needs an update

void update_dydx(vector<double> & dydx, const vector<double> &x) {
  for(size_t i = 0; i < x.size()/2; ++i) {
    dydx[2*i] = -x[2*i+1]; 
    dydx[2*i+1] = x[2*i];
  } 
}

There is also case 3):

vector<double> dydx_by_v(vector<double> x) {
  vector<double> dydx_tmp(x.size());
  for(size_t i = 0; i < x.size()/2; ++i) {
    dydx_tmp[2*i] = -x[2*i+1]; 
    dydx_tmp[2*i+1] = x[2*i];
  } 
  return dydx_tmp;
}

This follows http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/, but in this case it doesn't matter: x's memory is used later for output, so the compiler's RVO won't make use of it.


Solution

The ultimate answer to all performance-related questions is to profile your whole app on hardware that is the same as (or at least similar to) what it will run on in production, but here are my three cents of theorycrafting:

  • Passing by value in option 3) doesn't make any sense: the function only reads x, so the extra copy buys nothing.

  • vector<double> dydx_tmp(x.size()); <- this value-initializes (i.e. zeroes) every element of your vector. Use vector<double> dydx; dydx.reserve(x.size()); and then emplace_back() in the loop. (Adding _tmp to the name is useless; everyone can see it's local.)

  • Option 2) involves an output parameter, which is considered bad style, and there will be no copy in option 1) anyway (as explained in the article you linked), so 1) is the best option.
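The reserve() plus emplace_back() suggestion could look roughly like the sketch below. The name dydx_reserved is made up for illustration, and the padding for an odd-sized x is an assumption added only so the result matches what option 1) returns. The update_dydx from option 2) is included next to it so the two can be compared or timed against each other. Note that the returned vector is elided or moved rather than copied, but each call to the returning version still allocates a fresh buffer, which is exactly what profiling should put a number on.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Option 1) without the zero-fill: reserve capacity once, then append.
// "dydx_reserved" is a hypothetical name, not from the original post.
std::vector<double> dydx_reserved(const std::vector<double>& x) {
    std::vector<double> dydx;
    dydx.reserve(x.size());                // allocates, but does not initialize
    for (std::size_t i = 0; i + 1 < x.size(); i += 2) {
        dydx.emplace_back(-x[i + 1]);      // becomes element i
        dydx.emplace_back(x[i]);           // becomes element i + 1
    }
    if (x.size() % 2 != 0) {
        dydx.emplace_back(0.0);            // assumption: pad like option 1) would
    }
    return dydx;                           // elided or moved (C++11 and later)
}

// Option 2) from the question: the caller owns the buffer and reuses it.
void update_dydx(std::vector<double>& dydx, const std::vector<double>& x) {
    assert(dydx.size() == x.size());       // the caller must have sized it already
    for (std::size_t i = 0; i < x.size() / 2; ++i) {
        dydx[2 * i] = -x[2 * i + 1];
        dydx[2 * i + 1] = x[2 * i];
    }
}
```

Both functions produce the same values for the same x, so either one can be dropped into the integrator loop and timed in the real application.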

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow