A few potential micro-optimsations:
- Multiplication may be faster than using
pow()
to square the distances; - The common factor
-1 * G * pow(p_mass,2) / mpc_to_m
could be made a constant; - It might be slightly faster to sum
1.0 / dist_betw
, then multiply by the common factor at the end. - You might (or might not) be able to approximate the square-root accurately enough, faster than
sqrt()
. There are plenty of approximations to try. - Convert each co-ordinate to
double
just once, storing the values in another array, rather than on every iteration of the inner loop.
Algorithmically, you could discard particles that are too far from the current point to make a significant contribution to the energy. A simple modification could check the squared distance, before the expensive square-root, and move on if that's too large. You might get a further improvement by storing the particles in a spatially-aware data structure, like an Octree, or just a simple grid, so that you don't even need to look at distant points; but that might not be practical if the particles move frequently.