Problem

According to questions on the internet, the bias is a learnable parameter and there are different approaches to updating it, but I could not find a concise description of how to correctly update biases during training.

When I tried to overfit a small network, it failed once the bias was introduced into training: (plot omitted)

When I tried to scale down the bias updates, it produced similar but delayed results: (plot omitted)

Next I tried making the bias updates proportional to the training-set error, which again only delayed the trend: (plot omitted)

Next I tried making the bias updates inversely proportional to the training-set error, although I suppose this would not show any benefit without a validation set. Alas, the effect was the same: (plot omitted)

According to @Noah Weber, bias is something that would help reduce overfitting during training, which is actually consistent with my previous experiments.

Based on this, I would suppose that the more overfitting occurs, the more the bias term should be updated. Overfitting can clearly be measured by the difference between the training and test error. Should the bias be updated according to that?


Solution

I think you are mixing up the bias of a model (in the statistical, bias-variance sense) with the bias terms of a neural network, which are just the constant term of the linear model in each layer. Updating the biases during training will not reduce overfitting, since each bias is an additional parameter of the model. Remember that the weights (and the bias is also a weight) are updated proportionally to the negative gradient of the loss function. Therefore there must be an error in your implementation, since a training error that keeps growing is highly unlikely under gradient descent (unless your learning rate is far too high).
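A minimal sketch of this standard update (not the poster's code; the single linear layer, toy data, learning rate, and iteration count are assumptions for illustration), showing that the bias follows exactly the same negative-gradient step as any other weight:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = 3x + 2 plus a little noise (illustrative values).
X = rng.uniform(-1.0, 1.0, size=(100, 1))
y = 3.0 * X + 2.0 + 0.05 * rng.normal(size=(100, 1))

w = rng.normal(size=(1, 1))  # weight
b = np.zeros(1)              # bias: just another learnable parameter
lr = 0.1                     # learning rate (assumed)

for epoch in range(500):
    y_hat = X @ w + b                # forward pass of the linear layer
    err = y_hat - y
    loss = float(np.mean(err ** 2))  # MSE loss

    # Gradients of the MSE loss with respect to w and b.
    grad_w = 2.0 * X.T @ err / len(X)
    grad_b = 2.0 * err.mean(axis=0)

    # Same rule for both parameters: step along the negative gradient.
    w -= lr * grad_w
    b -= lr * grad_b

    if epoch % 100 == 0:
        print(f"epoch {epoch}: loss {loss:.4f}")

print(w.ravel(), b)  # should approach [3.] and [2.]
```

Run as-is, w and b converge to roughly 3 and 2 and the printed training loss decreases; if the training error grows instead, the gradient signs and the learning rate are the first things to check.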

License: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange