Understanding truncation and rounding error in IEEE floating point system?
-
05-11-2019 - |
Pregunta
I'm trying to understand the theory behind finding the optimal $h$ value for differentiation in this definition:
$$ \frac{f(x+h) - f(x)}{h}$$
as $h$ tends to 0.
Here is my understanding:
- Truncation error comes from say approximating the value of $f(x+h)$ using the taylor series. In this case, it could be that:
$$ f(x+h) \approx f(x) + hf'(x) + \frac{h^2}{2!}f''(x) + O(h^3)$$
- Letting $f'(x)$ to be the subject since we want to find gradient $f'(x)$, we have
$$f'(x) = \frac{f(x) - (hf'(x) + \frac{h^2}{2!}f''(x) + O(h^3))}{h}$$ $$\implies f'(x) = \frac{f(x+h)-f(x)}{h} - \frac{h}{2}f''(x) + O(h^2)$$
Intuitively, we can stop at here by letting $h \approx \sqrt\epsilon$ where $\epsilon$ is the machine epsilon, we can thus limit our truncation error. But suppose we continue, then our truncation error is $\frac{h}{2}f''(x)$.
- Then rounding error comes from the interaction of the sums in the differentiation equation:
$$ \frac{f(x+h) - f(x)}{h} = \frac{[f(x) + hf'(x) + O(h^2)] - f(x)}{h}$$
But for $f'(x) \approx 1$ and since we have $h \approx \sqrt \epsilon$, we have
$$ \frac{f(x+h) - f(x)}{h} \approx 1 \pm \frac{\epsilon}{h}$$
Which means the rounding error is approximately $\frac{\epsilon}{h}$, which concurs with the fact that the left hand side of the equation is actually $f'(x)$ as well, just that we have managed to reduce the rounding error to what is seen on the right hand side.
- Now, we see that our truncation error varies linearly with $h$ while our rounding error varies inversely with $h$. To find the optimal $h$ value, we equate both errors:
$$\frac{\epsilon}{h} = \frac{h}{2}f''(x) \approx \frac{h}{2}$$ since we have $f''(x) \approx 1$.
Letting $h$ be the subject, we have $$h = \sqrt{2\epsilon}$$ which means the optimal $h$ is around $h = \sqrt{\epsilon}$ as expected.
Is my understanding correct? Does the final factor of 2 in the end matter?
Big problem in the above argument: in point 3, one already use the fact that $h = \sqrt \epsilon$. Wouldn't that already defeat the purpose of finding the optimal $h$?
No hay solución correcta