Problem

Learning a perceptron can be easily accomplished using the update rule w_i = w_i + \eta (y - \hat{y}) x_i.
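For concreteness, here is a minimal sketch of this rule in Python, assuming {0, 1} labels and a threshold activation (the function name, parameters, and convergence check are illustrative, not from the original):

```python
import numpy as np

# Minimal sketch of the perceptron update rule, assuming {0, 1} labels and a
# threshold activation; names and defaults here are illustrative assumptions.
def perceptron_train(X, y, eta=1.0, w=None, epochs=100):
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for x_i, y_i in zip(X, y):
            y_hat = 1 if np.dot(w, x_i) >= 0 else 0   # prediction \hat{y}
            w = w + eta * (y_i - y_hat) * x_i          # w_i = w_i + eta (y - \hat{y}) x_i
            mistakes += int(y_hat != y_i)
        if mistakes == 0:                              # no errors: converged on separable data
            break
    return w
```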

All the resources I have read so far say that the learning rate \eta can be set to 1 without loss of generality.

My question is the following: is there any proof that the speed of convergence will always be the same, given that the data is linearly separable? Should this not also depend on the initial weight vector \mathbf{w}?


Solution

Citing Wikipedia:

The decision boundary of a perceptron is invariant with respect to scaling of the weight vector; that is, a perceptron trained with initial weight vector \mathbf{w} and learning rate \alpha behaves identically to a perceptron trained with initial weight vector \mathbf{w}/\alpha and learning rate 1. Thus, since the initial weights become irrelevant with increasing number of iterations, the learning rate does not matter in the case of the perceptron and is usually just set to 1.
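To make the scaling argument concrete, here is a small sketch (the data, seed, epoch count, and the train helper are assumptions for illustration): training from \mathbf{w} with learning rate \alpha and training from \mathbf{w}/\alpha with learning rate 1 produce weight trajectories that differ only by the factor \alpha, and hence identical decision boundaries at every step.

```python
import numpy as np

# Compare two runs of the perceptron rule: (initial w, rate alpha) vs.
# (initial w / alpha, rate 1). Data and hyperparameters are illustrative.
def train(X, y, w, eta, epochs=50):
    history = []
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            y_hat = 1 if np.dot(w, x_i) >= 0 else 0
            w = w + eta * (y_i - y_hat) * x_i
        history.append(w.copy())
    return np.array(history)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = (X @ np.array([1.0, -2.0, 0.5]) >= 0).astype(int)   # linearly separable labels

w0 = rng.normal(size=3)
alpha = 0.1
run_a = train(X, y, w0, eta=alpha)            # initial weights w, learning rate alpha
run_b = train(X, y, w0 / alpha, eta=1.0)      # initial weights w/alpha, learning rate 1

# The trajectories differ only by the positive scale factor alpha, so the
# sign of w.x (the decision boundary) is the same at every step.
print(np.allclose(run_a, run_b * alpha))       # True
```

Note that this only shows invariance under joint rescaling of the initial weights and the learning rate; it does not by itself say that convergence speed is independent of the initial weight vector, which is the separate question raised above.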

License: CC-BY-SA with attribution
Not affiliated with Stack Overflow