Why are values of λ in regularization given as ln λ, such as ln λ = −18, instead of directly as, for example, λ = 0.3?

https://datascience.stackexchange.com/questions/57357

regularization

02-11-2019

Question

I'm studying Pattern Recognition and Machine Learning by Christopher Bishop. I noticed that he reports values of λ as ln λ rather than as λ itself. For example:

We see that, for a value of ln λ = −18, the over-fitting has been suppressed and we now obtain a much closer representation of the underlying function sin(2πx). If, however, we use too large a value for λ then we again obtain a poor fit, as shown in Figure 1.7 for ln λ = 0.

What is the reason for this? Why doesn't he just state λ directly?
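
To make the two quoted values concrete, here is a minimal sketch that converts them back to λ. It assumes that ln denotes the natural logarithm; the variable names are only illustrative and do not come from the book.

```python
import math

# Values of ln(lambda) taken from the quoted passage.
ln_lambdas = [-18.0, 0.0]

# Convert each back to lambda itself, assuming ln is the natural logarithm.
lambdas = [math.exp(v) for v in ln_lambdas]

for ln_lam, lam in zip(ln_lambdas, lambdas):
    print(f"ln(lambda) = {ln_lam:6.1f}  ->  lambda = {lam:.3e}")
```

So ln λ = −18 corresponds to λ of roughly 1.5 × 10⁻⁸, while ln λ = 0 corresponds to λ = 1. The two settings differ by almost eight orders of magnitude.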

No correct solution

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange