Question

I am running a simple multivariate regression on a panel/time-series dataset in two ways: with lm() and with the closed-form formula $(X'X)^{-1} X'Y$. I expected both methods to give the same coefficient values, but I get completely different estimates.

Here is the R code:

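  return = matrix(ret.ff.zoo, ncol = 50)  # response matrix Y (50 columns)
  data = cbind(df$EQ, df$EFF, df$SIZE, df$MOM, df$MSCR, df$SY, df$UMP)   # regressor matrix X

  # First method: closed-form formula (no intercept column in X)
  BETA = solve(crossprod(data)) %*% crossprod(data, return)

  # Second method: lm(), which adds an intercept by default
  OLS <- lm(return ~ data)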

I am not sure why the estimates differ between the two methods.

Any help is appreciated! Thank you.


Solution

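Your example isn't reproducible, but the likely cause is the intercept: lm() adds one by default, whereas your matrix formula has no column of ones. With some dummy data, the matrix formula and lm() produce the same results once you take the intercept out of the lm() call: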

set.seed(1)

x <- matrix(rnorm(1000),ncol=5)
y <- rnorm(200)

solve(t(x) %*% x) %*% t(x) %*% y
              [,1]
[1,] -0.0826496646
[2,] -0.0165735273
[3,] -0.0009412659
[4,]  0.0070475728
[5,] -0.0642452777
lm(y ~ x + 0)

Call:
lm(formula = y ~ x + 0)

Coefficients:
        x1          x2          x3          x4          x5  
-0.0826497  -0.0165735  -0.0009413   0.0070476  -0.0642453  
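
The comparison also works the other way round: keep lm()'s default intercept and add a column of ones to the design matrix instead. The following is a minimal sketch on the same dummy data; the names X, beta_manual and fit are illustrative and do not come from the original question.

```r
set.seed(1)

x <- matrix(rnorm(1000), ncol = 5)
y <- rnorm(200)

# lm() adds an intercept by default, so prepend a column of ones
# to the design matrix before solving the normal equations.
X <- cbind(1, x)
beta_manual <- solve(crossprod(X), crossprod(X, y))

# Default lm() fit, intercept included.
fit <- lm(y ~ x)

# Compare the two coefficient vectors, ignoring names.
all.equal(as.vector(beta_manual), unname(coef(fit)))
```

The two sets of estimates should agree up to floating-point tolerance. Passing the right-hand side as the second argument of solve(), as here, avoids forming the explicit inverse, which is usually the more numerically stable choice.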
License: CC-BY-SA with attribution