Why is the 0-1 loss function (the most obvious and informative choice from the standpoint of conceptual binary classification models) not used in the perceptron or SVM algorithms?


Solution

In the case of perceptrons, most of the time they are trained using gradient descent (or something similar), and the 0-1 loss function is flat almost everywhere, so its gradient provides no signal and training doesn't converge well (not to mention that it's not differentiable at 0).
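A minimal sketch of why this is a problem. Writing both losses as functions of the margin `m = y * f(x)` (a common convention, not something specific to one library), the 0-1 loss is piecewise constant, so a finite-difference estimate of its gradient is zero almost everywhere, while a surrogate like the hinge loss has a usable slope:

```python
import numpy as np

def zero_one_loss(margin):
    # 1 if misclassified (margin <= 0), 0 otherwise.
    return np.where(margin <= 0, 1.0, 0.0)

def hinge_loss(margin):
    # A convex surrogate with a nonzero slope wherever margin < 1.
    return np.maximum(0.0, 1.0 - margin)

margins = np.array([-2.0, -0.5, 0.5, 2.0])
print(zero_one_loss(margins))  # [1. 1. 0. 0.]
print(hinge_loss(margins))     # [3.  1.5 0.5 0. ]

# Finite-difference "gradient" of the 0-1 loss: zero at every point
# that isn't exactly on the decision boundary, so gradient descent
# gets no information about which direction reduces the error.
eps = 1e-6
grad = (zero_one_loss(margins + eps) - zero_one_loss(margins)) / eps
print(grad)  # [0. 0. 0. 0.]
```

This is why training substitutes a surrogate loss with informative gradients, even though the 0-1 loss is what we ultimately care about.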

SVM is based on solving an optimization problem that maximizes the margin between classes. In this context a convex loss function is preferable, because it lets us use general convex optimization methods. The 0-1 loss function is not convex, so it is not very useful here either. Note that this reflects the current state of the art: if a new method that optimizes non-convex functions efficiently were discovered, that could change.
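The non-convexity is easy to check numerically. Convexity requires f((a+b)/2) ≤ (f(a)+f(b))/2 for all a, b; the 0-1 loss violates this at a single pair of margins, while the hinge loss (the surrogate used in the standard SVM formulation) satisfies it:

```python
def zero_one_loss(m):
    return 1.0 if m <= 0 else 0.0

def hinge_loss(m):
    return max(0.0, 1.0 - m)

# Convexity requires f((a+b)/2) <= (f(a) + f(b)) / 2 for every a, b.
a, b = -1.0, 1.0
mid = (a + b) / 2  # 0.0

# The 0-1 loss violates the inequality at this pair, so it is not convex:
print(zero_one_loss(mid), (zero_one_loss(a) + zero_one_loss(b)) / 2)  # 1.0 0.5

# The hinge loss satisfies it here (and everywhere), which is what lets
# SVM solvers rely on off-the-shelf convex optimization machinery:
print(hinge_loss(mid), (hinge_loss(a) + hinge_loss(b)) / 2)  # 1.0 1.0
```

One counterexample is enough to rule out convexity, which is why the 0-1 loss cannot be handed directly to a convex solver.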

