Why are values of λ in regularization given as ln λ, such as ln λ = −18, instead of as λ directly, for example λ = 0.3?
02-11-2019
Question
I'm studying Pattern Recognition and Machine Learning by Christopher Bishop. I noticed that he states the values of the regularization coefficient λ in terms of ln λ rather than λ itself. For example:
"We see that, for a value of ln λ = −18, the over-fitting has been suppressed and we now obtain a much closer representation of the underlying function sin(2πx). If, however, we use too large a value for λ then we again obtain a poor fit, as shown in Figure 1.7 for ln λ = 0."
What is the reason for this? Why doesn't he just use λ directly?
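To make the comparison concrete, here is a minimal sketch that converts the ln λ values from the quoted passage back to λ, and converts λ = 0.3 to its logarithm. The helper name `lam_from_log` is hypothetical and only for illustration; nothing here is taken from the book beyond the two quoted values.

```python
import math


def lam_from_log(ln_lam: float) -> float:
    """Convert ln(lambda) back to lambda itself (hypothetical helper)."""
    return math.exp(ln_lam)


# The two ln(lambda) values quoted from the book passage.
for ln_lam in (-18.0, 0.0):
    lam = lam_from_log(ln_lam)
    print(f"ln lambda = {ln_lam:6.1f}  ->  lambda = {lam:.3e}")

# The value from the question title, expressed on the log scale.
print(f"lambda = 0.3  ->  ln lambda = {math.log(0.3):.3f}")
```

So ln λ = −18 corresponds to λ of roughly 1.5 × 10⁻⁸, while ln λ = 0 corresponds to λ = 1. The two settings differ by nearly eight orders of magnitude, which is awkward to write out as plain decimals.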
No correct solution
Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange