Why does average damping magically speed up the convergence of fixed-point calculators?

https://stackoverflow.com/questions/3860929

27-09-2019
|

Question

I'm reading through SICP, and the authors brush over the technique of average damping in computing the fixed points of functions. I understand that it's necessary in certain cases, ie square roots in order to damp out the oscillation of the function y = x/y however, I don't understand why it magically aids the convergence of the fixed point calculating function. Help?

edit

Obviously, I've thought this through somewhat. I can't seem to wrap my head around why averaging a function with itself would speed up convergence when applied repeatedly.

Solution

It only speeds up those functions whose repeated applications "hop around" the fixpoint. Intuitively, it's like adding a brake to a pendulum - it'll stop sooner with the brake.

But not every function has this property. Consider f(x)=x/2. This function will converge sooner without the average damping (log base 2 steps vs log base (4/3) steps), because it approaches the fixpoint from one side.

OTHER TIPS

While I can't answer your question on a mathematical basis, I'll try on an intuitive one: fixpoint techniques need a "flat" function graph around their ..well.. fixpoint. This means: if you picture your fixpoint function on an X-Y chart, you'll see that the function crosses the diagonal (+x,+y) exactly at the true result. In one step of your fixpoint algorithm you are guessing an X value which needs to be within the interval around the intersection point where the first derivative is between (-1..+1) and take the Y value. The Y that you took will be closer to the intersection point because starting from the intersection it is reachable by following a path which has a smaller slope than +/-1 , in contrast to the previous X value that you utilized, which has in this sense, the exact slope -1. It is immediately clear now that the smaller the slope, the more way you make towards the intersection point (the true function value) when using the Y as new X. The best interpolation function is trivially a constant, which has slope 0, giving you the true value in the first step.

Sorry to all mathematicians.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow