Pergunta

I was playing around with doing a manual calculation for the OLS estimators using linear algebra in R and I got a different answer to R's inbuilt regression function lm(). Would anyone be able to tell me why there is a difference? Is R not performing OLS?

> x1<-rbind(1,2,3,4,5)
> x2<-rbind(3,65,7,2,1)
> x3<-rbind(34,7,23,2,4)
> x4<-rbind(25,50,70,90,110)
> y<-rbind(1,2,3,4,5)
> X<-as.matrix(cbind(x1,x2,x3,x4))
> Y<-as.matrix(cbind(y))
> 
> beta.hat<-solve(t(X)%*%X)%*%t(X)%*%Y
> r.regression<-lm(Y~0+X)
> 
> beta.hat
              [,1]
[1,]  1.000000e+00
[2,] -2.595146e-15
[3,]  8.174017e-15
[4,] -2.309264e-14
> r.regression

Call:
lm(formula = Y ~ 0 + X)

Coefficients:
        X1          X2          X3          X4  
 1.000e+00   3.331e-18   4.152e-17  -6.783e-17  
Foi útil?

Solução

R is performing OLS alright, the problem is in the example you provide. Here is a demonstration, building on what @DWin has commented.

set.seed(1234)
x1 <- rnorm(5,mean=3)
x2 <- rnorm(5,mean=1,sd=5)
x3 <- rnorm(5,mean=7,sd=1)
x4 <- rnorm(5,mean=1,sd=2)

X<-as.matrix(cbind(x1,x2,x3,x4))
Y<-as.matrix(cbind(y))

(beta.hat<-solve(t(X)%*%X)%*%t(X)%*%Y)
lm(Y~X+0)

As you can see, the coefficients are exactly the same, and your code is repeated except for the more suitable data.

P.S. If this is classified as an off-topic question, feel free to delete the answer as well. My intention was just to illustrate the issue with some code, which doesn't fit in a commentary.

Licenciado em: CC-BY-SA com atribuição
Não afiliado a StackOverflow
scroll top