Question

I am running a simple multivariate regression on a panel/time-series dataset in two ways, with lm() and with the underlying formula $(X'X)^{-1} X'Y$, expecting to get the same coefficient values from both methods. However, I get completely different estimates.

Here is the R code:

  return = matrix(ret.ff.zoo, ncol = 50)  # response matrix Y (50 columns)
  data = cbind(df$EQ, df$EFF, df$SIZE, df$MOM, df$MSCR, df$SY, df$UMP)   # regressor matrix X

  #First method     
  BETA = solve(crossprod(data)) %*% crossprod(data, return)

  #Second method
  OLS <- lm(return ~ data)

I am not sure why the estimates differ between the two methods.

Any help is appreciated! Thank you.


Solution

Your example isn't reproducible, but with some dummy data the matrix formula and lm() produce the same results once you take out the intercept. lm() fits an intercept by default, while the matrix formula applied to your data matrix as-is does not:

set.seed(1)

x <- matrix(rnorm(1000),ncol=5)
y <- rnorm(200)

solve(t(x) %*% x) %*% t(x) %*% y
              [,1]
[1,] -0.0826496646
[2,] -0.0165735273
[3,] -0.0009412659
[4,]  0.0070475728
[5,] -0.0642452777
lm(y ~ x + 0)

Call:
lm(formula = y ~ x + 0)

Coefficients:
        x1          x2          x3          x4          x5  
-0.0826497  -0.0165735  -0.0009413   0.0070476  -0.0642453  
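
The opposite fix also works. Instead of dropping the intercept from lm(), you can add a column of ones to the design matrix so that the matrix formula estimates an intercept as well. Here is a minimal sketch of that, assuming the same kind of dummy data as above; the names X, beta_matrix and beta_lm are illustrative and not from the question.

```r
set.seed(1)

# Dummy data, same shape as in the example above
x <- matrix(rnorm(1000), ncol = 5)
y <- rnorm(200)

# Prepend a column of ones so the matrix formula also estimates an intercept
X <- cbind(1, x)

# Closed-form OLS: solve (X'X) b = X'y without forming the inverse explicitly
beta_matrix <- solve(crossprod(X), crossprod(X, y))

# lm() with its default intercept
beta_lm <- coef(lm(y ~ x))

# Compare the two coefficient vectors, ignoring names
print(all.equal(as.vector(beta_matrix), unname(beta_lm)))
```

If the two approaches agree up to numerical tolerance, the last line prints TRUE. Passing the right-hand side directly to solve() is generally preferable to computing the inverse first and then multiplying.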
Licensed under: CC-BY-SA with attribution