For binary classification, irrespective of the model used, the sigmoid function is a good choice for the output layer: the actual output value Y is either 0 or 1, so it makes sense for the predicted output to be a number between 0 and 1.

My confusion is this: is there a binary step function in the output layer that squashes the linear combination of weights and inputs to exactly 0 or 1? Does classification always mean applying a thresholding function on top of the linear or non-linear function computed in the hidden layer?
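
To make the question concrete, here is a minimal sketch of the pipeline I have in mind: a linear combination of weights and inputs, a sigmoid, and then a step function on top. The weights, inputs, bias, and the 0.5 threshold are all made-up values for illustration, not taken from any particular model.

```python
import math

def linear_combination(weights, inputs, bias):
    # Weighted sum of the inputs plus a bias term.
    return sum(w * x for w, x in zip(weights, inputs)) + bias

def sigmoid(z):
    # Squashes any real number into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def step(p, threshold=0.5):
    # Hypothetical thresholding step: maps a probability to a hard 0/1 label.
    return 1 if p >= threshold else 0

# Illustrative values only (assumptions, not from a trained model).
weights = [0.8, -0.4]
inputs = [2.0, 1.0]
bias = -0.1

z = linear_combination(weights, inputs, bias)  # roughly 1.1
p = sigmoid(z)                                 # roughly 0.75
label = step(p)                                # hard class label

print(f"z = {z:.3f}, sigmoid(z) = {p:.3f}, label = {label}")
```

In this sketch the sigmoid output stays a real number in (0, 1); the 0 or 1 only appears once the separate step function is applied.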

Say the predicted output value is 0.75 and the actual Y is 0. How is 0.75 then converted to 1? As I understand it, the loss function would calculate the error as actual - predicted = 0 - 0.75 = -0.75.
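
The numbers in this example can be worked through as a small sketch. It assumes a decision threshold of 0.5 and binary cross-entropy as the loss; neither is stated above, and they are only the most common choices. Under those assumptions the loss is computed on the probability 0.75 itself, and the threshold is applied separately, only when a hard class label is needed.

```python
import math

def binary_cross_entropy(y_true, p, eps=1e-12):
    # Assumed loss for this sketch; clips p to avoid log(0).
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

y_true = 0   # actual Y from the example
p = 0.75     # predicted output value from the example

# Assumed threshold of 0.5: this is what turns 0.75 into the label 1.
predicted_label = 1 if p >= 0.5 else 0

# The simple difference from the example: actual - predicted.
raw_difference = y_true - p

# Cross-entropy uses the probability directly, not the thresholded label.
loss = binary_cross_entropy(y_true, p)  # equals -log(1 - 0.75)

print(f"label = {predicted_label}, difference = {raw_difference}, loss = {loss:.4f}")
```

With these assumptions the predicted label is 1 (a misclassification), the plain difference is -0.75, and the cross-entropy loss is -log(0.25), about 1.386.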

Can somebody please explain the math, or point me to some links where the working steps are shown? Thank you.


Licensed under: CC-BY-SA with attribution