Question

I have been trying to verify that OLS estimators are consistent under the usual assumptions.

Could you please tell me how I could generate some observations from a linear model so that afterwards I can run a regression on that data and hopefully verify the desirable properties of OLS?

Thank you in advance,


Solution

Not sure if this is exactly what you want, but here is one way to simulate data from a linear model and fit it with OLS:

# sample data: x1 and x2 are independent permutations of 1:100, so uncorrelated
# (add set.seed(...) first if you want reproducible draws)
df <- data.frame(x1 = sample(1:100, 100), x2 = sample(1:100, 100))
# y = 1 + 2.5*x1 - 3.2*x2 + N(0, 5) noise
df$y <- with(df, 1 + 2.5*x1 - 3.2*x2 + rnorm(100, 0, 5))

fit <- lm(y ~ x1 + x2, data = df)
summary(fit)
#...
# Residuals:
#     Min      1Q  Median      3Q     Max 
# -9.8951 -2.6056 -0.4384  3.6082  9.5044 
# Coefficients:
#             Estimate Std. Error  t value Pr(>|t|)    
# (Intercept)    1.954      1.263    1.548    0.125    
# x1             2.516      0.016  157.257   <2e-16 ***
# x2            -3.237      0.016 -202.306   <2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# Residual standard error: 4.611 on 97 degrees of freedom
# Multiple R-squared:  0.9986,  Adjusted R-squared:  0.9986 
# F-statistic: 3.48e+04 on 2 and 97 DF,  p-value: < 2.2e-16

Note that the residual standard error (~4.6) agrees with the "true" error sd of 5. Also note that the (Intercept) is estimated imprecisely: its standard error (1.263) is large relative to its true value of 1, precisely because sd(y|x) = 5.
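Since the original question was about consistency, here is a minimal sketch along the same lines (the seed and the sample sizes are arbitrary choices of mine, not from the answer above): refit the same model at increasing n and watch the slope estimates settle toward the true values 2.5 and -3.2.

```r
set.seed(1)  # for reproducibility
for (n in c(50, 500, 5000, 50000)) {
  # same data-generating process as above, at sample size n
  x1 <- runif(n, 0, 100)
  x2 <- runif(n, 0, 100)
  y  <- 1 + 2.5*x1 - 3.2*x2 + rnorm(n, 0, 5)
  # print the fitted coefficients; they tighten around (1, 2.5, -3.2) as n grows
  print(round(coef(lm(y ~ x1 + x2)), 3))
}
```

The spread of the estimates around the true values shrinks roughly like 1/sqrt(n), which is the consistency you wanted to see.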

par(mfrow=c(2,2))
plot(fit)

Note that the Q-Q plot is consistent with normally distributed residuals (as it should be, since the noise was drawn from rnorm).
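To see the other desirable property, unbiasedness, you can repeat the whole simulation many times and average the estimates. This is a small Monte Carlo sketch (the seed and replication count are my choices, not part of the answer above):

```r
set.seed(42)  # for reproducibility
# repeat the data-generating process 1000 times, keeping the x1 slope each time
b1 <- replicate(1000, {
  x1 <- sample(1:100, 100)
  x2 <- sample(1:100, 100)
  y  <- 1 + 2.5*x1 - 3.2*x2 + rnorm(100, 0, 5)
  coef(lm(y ~ x1 + x2))["x1"]
})
mean(b1)  # close to the true slope 2.5, i.e. no systematic bias
sd(b1)    # roughly the Std. Error that summary(fit) reported (~0.016)
```

The mean of the simulated slopes sits very close to 2.5, and their standard deviation matches the analytic standard error, which is what the OLS theory predicts under these assumptions.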

Licensed under: CC-BY-SA with attribution