My ADALINE model using gradient descent increases its error on each iteration
16-10-2019
Question
I have used the Iris dataset's 1st and 3rd columns as the features, and the labels Iris setosa (-1) and Iris versicolor (1). I am using ADALINE as a simple classification model for my dataset, with gradient descent as the cost-minimizing procedure. But on every iteration the error increases. What am I doing wrong in the Python code below?
import numpy as np
import pandas as pd

class AdalineGD(object):
    def __init__(self, eta=0.01, n_iter=50):
        self.eta = eta
        self.n_iter = n_iter

    def fit(self, X, y):
        """Fit training data."""
        self.w_ = np.random.random(X.shape[1])
        self.cost_ = []
        print('Initial weights are: %r' % self.w_)
        for i in range(self.n_iter):
            output = self.net_input(X)
            print("On iteration %d, output is: %r" % (i, output))
            errors = output - y
            print("On iteration %d, error is: %r" % (i, errors))
            self.w_ += self.eta * X.T.dot(errors)
            print('Weights on iteration %d: %r' % (i, self.w_))
            cost = (errors ** 2).sum() / 2.0
            self.cost_.append(cost)
            print("On iteration %d, cost is: %r" % (i, cost))
            prediction = self.predict(X)
            print("Prediction after iteration %d is: %r" % (i, prediction))
            input()  # pause between iterations
        return self

    def net_input(self, X):
        """Calculate net input."""
        return X.dot(self.w_)

    def activation(self, X):
        """Compute linear activation."""
        return self.net_input(X)

    def predict(self, X):
        """Return class label after unit step."""
        return np.where(self.activation(X) >= 0.0, 1, -1)

####### END OF THE CLASS ########

# Importing the Iris dataset
df = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data", header=None)
y = df.iloc[0:100, 4].values
y = np.where(y == 'Iris-setosa', -1, 1)
X = df.iloc[0:100, [0, 2]].values
# Adding the ones (bias) column to the X matrix
X = np.insert(X, 0, np.ones(X.shape[0]), axis=1)

ada = AdalineGD(n_iter=20, eta=0.001).fit(X, y)
Solution
I think something is wrong here.
self.w_ += self.eta * X.T.dot(errors)
You are stepping in the positive direction of the gradient, while gradient descent requires stepping in the negative direction.
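To see why, write out the gradient of the sum-of-squared-errors cost your fit method computes (this is the standard ADALINE derivation, spelled out here for clarification):

$$J(w) = \frac{1}{2}\sum_i \left(x_i^\top w - y_i\right)^2, \qquad \nabla_w J = X^\top (Xw - y) = X^\top \cdot \text{errors}$$

Descending the cost means subtracting $\eta \, \nabla_w J$ from the weights: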
self.w_ -= self.eta * X.T.dot(errors)
or
self.w_ += -self.eta * X.T.dot(errors)
Adding the gradient instead climbs the cost surface, which is exactly why your error grows on every iteration.
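For a quick sanity check, here is a minimal sketch of the corrected update on a small synthetic dataset (the data and seed are made up for illustration; they are not from your question):

import numpy as np

# Synthetic, linearly separable data (assumed for illustration only)
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])
true_w = np.array([0.5, 1.0, -2.0])
y = np.where(X.dot(true_w) >= 0.0, 1, -1)

eta = 0.001
w = rng.random(X.shape[1])
costs = []
for _ in range(20):
    errors = X.dot(w) - y        # output - y, same convention as the question
    w -= eta * X.T.dot(errors)   # minus sign: step against the gradient
    costs.append((errors ** 2).sum() / 2.0)

print("first cost: %.3f, last cost: %.3f" % (costs[0], costs[-1]))
assert costs[-1] < costs[0]      # the cost now decreases instead of growing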
OTHER TIPS
If you want to keep
self.w_ += self.eta * X.T.dot(errors)
as I like to do, you just have to change
errors = output - y
to
errors = y - output
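Both conventions produce the same weight update; here is a quick sketch verifying that on random numbers (the values are arbitrary, used only for the check):

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))
y = rng.choice([-1, 1], size=5)
w = rng.random(3)
eta = 0.001
output = X.dot(w)

step_minus = -eta * X.T.dot(output - y)  # w -= eta * X.T.dot(output - y)
step_plus = eta * X.T.dot(y - output)    # w += eta * X.T.dot(y - output)
assert np.allclose(step_minus, step_plus)  # identical updates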
Hope this helps : )
Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange