Cost Function, Linear Regression, trying to avoid hard coding theta. Octave.

Question 1

You can vectorize operations in Octave/Matlab. Iterating over an entire vector is a bad idea if your programming language lets you vectorize operations. R, Octave, Matlab, and Python (NumPy) all support this. For example, if theta = (t0, t1, t2, t3) and X = (x0, x1, x2, x3), you can compute their scalar product as follows:

theta * X' = (t0, t1, t2, t3) * (x0, x1, x2, x3)' = t0*x0 + t1*x1 + t2*x2 + t3*x3

The result is a scalar.

For example, you can vectorize h in your code as follows:

H = (theta'*X')';
S = sum((H - y) .^ 2);
J = S / (2*m);
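To make the loop-versus-vectorized point concrete, here is a minimal NumPy sketch of the same cost computed both ways. The function names `cost_loop` and `cost_vectorized` and the toy data are made up for illustration, and X is assumed to already include the column of ones.

```python
import numpy as np

def cost_loop(X, y, theta):
    """Cost computed one training example at a time (slow, for comparison)."""
    m = len(y)
    total = 0.0
    for i in range(m):
        # h(x_i) = theta_0 * x_i0 + theta_1 * x_i1 + ...
        h_i = sum(X[i, j] * theta[j] for j in range(len(theta)))
        total += (h_i - y[i]) ** 2
    return total / (2 * m)

def cost_vectorized(X, y, theta):
    """Same cost using a single matrix-vector product."""
    m = len(y)
    residual = X @ theta - y
    return np.sum(residual ** 2) / (2 * m)

# Hypothetical toy data: the first column of X is the intercept term.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 2.0, 3.0])
theta = np.zeros(2)

print(cost_loop(X, y, theta), cost_vectorized(X, y, theta))
```

Both functions should agree up to floating-point rounding; the vectorized one simply hands the inner loop to the linear algebra library.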

Question 2

The answer above works, but you can also write:

H = (X*theta);
S = sum((H - y) .^ 2);
J = S / (2*m);

Rather than computing

theta' * X'

and then taking the transpose, you can directly calculate

X * theta

The two give the same result, since (theta' * X')' = X * theta.
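The equivalence of the two expressions can be checked numerically. This is a small NumPy sketch with made-up values, where `.T` plays the role of Octave's `'` and `@` the role of `*`:

```python
import numpy as np

# Hypothetical toy values: X is 3x2 (intercept column first), theta is 2x1.
X = np.array([[1.0, 2.0], [1.0, 3.0], [1.0, 5.0]])
theta = np.array([[0.5], [2.0]])

H_transposed = (theta.T @ X.T).T  # mirrors (theta' * X')'
H_direct = X @ theta              # mirrors X * theta

print(np.allclose(H_transposed, H_direct))
```

Both forms yield a 3x1 column of predictions; the direct product just skips the two extra transposes.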

Question 3

The line below returns the required cost value of 32.07 when computeCost is run once with θ initialized to zeros:

J = (1/(2*m)) * (sum(((X * theta) - y).^2));

and it closely mirrors the original cost function formula.
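The formula itself is not reproduced in this answer. Assuming it is the standard squared-error cost for linear regression with hypothesis h_θ(x) = θ^T x, it would read as follows, with the second line being the matrix form that the Octave one-liner implements:

```latex
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta\!\left(x^{(i)}\right) - y^{(i)} \right)^2
          = \frac{1}{2m} \, (X\theta - y)^{T} (X\theta - y)
```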

Question 4

It can also be done in one line, where m is the number of training examples:

J=(1/(2*m)) * ((((X * theta) - y).^2)'* ones(m,1));

Question 5

J = sum(((X*theta)-y).^2)/(2*m);
ans =  32.073

The answer above is perfect. I thought about this problem for a day and am still unfamiliar with Octave, so let's just study together!

Question 6

If you want to use only matrix operations:

temp = (X * theta - y);        % h(x) - y
J = ((temp')*temp)/(2 * m);
clear temp;
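The `sum(...)`, `ones(m,1)`, and `temp' * temp` variants in this thread all compute the same sum of squared residuals. A short NumPy sketch with a made-up residual vector `r` (standing in for `X * theta - y`) shows the three side by side:

```python
import numpy as np

# Hypothetical residual vector r = X*theta - y, stored as an m x 1 column.
r = np.array([[1.0], [-2.0], [3.0]])
m = r.shape[0]

# sum(r.^2) / (2*m)
via_sum = np.sum(r ** 2) / (2 * m)
# (r.^2)' * ones(m,1) / (2*m)
via_ones = ((r ** 2).T @ np.ones((m, 1))).item() / (2 * m)
# (r' * r) / (2*m)
via_inner = (r.T @ r).item() / (2 * m)

print(via_sum, via_ones, via_inner)
```

The inner-product form avoids the element-wise square entirely, which is why it is often described as the "pure matrix" version.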

Question 7

This would work just fine for you:

J =  sum((X*theta - y).^2)*(1/(2*m))

This follows directly from the cost function equation.

Question 8

Python code for the same:

import numpy as np

def computeCost(X, y, theta):
    m = y.size  # number of training examples
    H = X.dot(theta)  # hypothesis for every training example
    S = np.sum((H - y) ** 2)  # sum of squared errors
    J = S / (2 * m)
    return J

Question 9

function J = computeCost(X, y, theta)

m = length(y);

J = 0;

% Hypothesis h(x)
h = X * theta;

% Error function (h(x) - y) ^ 2
squaredError = (h-y).^2;

% Cost function
J = sum(squaredError)/(2*m);

end

Question 10

I think we need to iterate to get a more general solution for the cost, rather than computing it in a single pass. Also, the 32.07 result shown in the PDF may not be the answer the grader is looking for, since it is only one case out of many in the training data.

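I think it should loop through like this:

% alpha (learning rate) and iteration (number of steps) are assumed to be defined
for i = 1:iteration
  theta = theta - (alpha/m) * (X' * (X*theta - y));  % gradient descent step
  J = (1/(2*m)) * sum((X*theta - y).^2);             % cost after this step
end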
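For reference, here is a runnable NumPy sketch of this kind of loop, assuming batch gradient descent is what is meant. The names `compute_cost` and `gradient_descent`, the parameters `alpha` and `num_iters`, and the toy data are all hypothetical and not taken from the exercise.

```python
import numpy as np

def compute_cost(X, y, theta):
    """Squared-error cost for a fixed theta (no iteration involved)."""
    m = y.size
    return np.sum((X @ theta - y) ** 2) / (2 * m)

def gradient_descent(X, y, theta, alpha, num_iters):
    """Repeat the vectorized update and record the cost after each step."""
    m = y.size
    history = []
    for _ in range(num_iters):
        theta = theta - (alpha / m) * (X.T @ (X @ theta - y))
        history.append(compute_cost(X, y, theta))
    return theta, history

# Hypothetical toy data generated from y = 2 * x, with an intercept column.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])

theta, history = gradient_descent(X, y, np.zeros(2), alpha=0.1, num_iters=2000)
print(theta, history[0], history[-1])
```

Note that the cost function is evaluated for whatever theta it is given; the loop belongs to gradient descent, which calls the cost function after each update.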