A quick question: I want to run negative binomial regressions using glm.nb from MASS. The dependent variables are val1, val2 and val3, each fitted in its own model, and the independent variables are a, b, c and d.

Here is some fake data:

library(data.table)
library(MASS)
test <- data.table(val1 = 1:10, val2 = 11:20, val3 = 21:30, a = rnorm(10), b = rnorm(10), c = rnorm(10), d = rnorm(10))
summary1 <- glm.nb(val1 ~ a + b + c + d, data = test)
summary2 <- glm.nb(val2 ~ a + b + c + d, data = test)
summary3 <- glm.nb(val3 ~ a + b + c + d, data = test)

I think this code is ugly, so I tried a loop instead:

for (i in c("val1", "val2", "val3")){
paste("sum_", c("val1", "val2", "val3"), sep = "") <- glm.nb(i ~ a + b + c + d, data = simple)
}

But it didn't work. Any suggestions for improving it? The original data has about 26 independent variables, so the code gets even uglier when every one of them has to be typed out, like this: sum1 <- glm.nb(val3 ~ a + b + c + d + e + f + g + h + i + j + k + l, data = test)

I know the following code might be helpful, but I don't know how to use it... :(

diff <- setdiff(colnames(test),c('val1','val2','val3'))

Also, can lapply do this within data.table?

Thanks a lot!


Solution

It is better to put your data in long format:

library(plyr)
library(reshape2)
xx <- melt(test,measure.vars=paste0('val',1:3))
ddply(xx,.(variable),function(x){
  coef(glm.nb(value~.,data=subset(x,select=-variable)))
})

  variable (Intercept)            a            b           c          d
1     val1    1.583602 -0.045909060 -0.018189342 0.026293033 0.29708648
2     val2    2.704601 -0.014641683 -0.003836401 0.006711503 0.10445377
3     val3    3.217729 -0.008925782 -0.001863267 0.003475509 0.06292286

If you want the full models, not just the coefficients:

dlply(xx,.(variable),function(x){
  glm.nb(value~.,data=subset(x,select=-variable))
})
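
Since the question also asks about data.table, here is a rough sketch of the same long-format idea using data.table's own melt and by-group evaluation instead of plyr. It is not part of the answer above. The table dt is a hypothetical stand-in for test: it has the same column names, but more rows and counts drawn with rnbinom, an assumption made only so that the fits converge without warnings.

```r
library(MASS)        # glm.nb
library(data.table)  # melt, by-group evaluation

set.seed(1)
n <- 200
# Hypothetical stand-in for `test` (same column names, simulated counts)
dt <- data.table(a = rnorm(n), b = rnorm(n), c = rnorm(n), d = rnorm(n))
dt[, val1 := rnbinom(n, mu = exp(1.0 + 0.3 * a), size = 2)]
dt[, val2 := rnbinom(n, mu = exp(1.5 + 0.2 * b), size = 2)]
dt[, val3 := rnbinom(n, mu = exp(2.0 - 0.2 * c), size = 2)]

# Long format: one row per (observation, response), responses stacked in `value`
long <- melt(dt, measure.vars = c("val1", "val2", "val3"))

# Fit one model per response; .SD holds that group's a, b, c, d and value
coefs <- long[, as.list(coef(glm.nb(value ~ a + b + c + d, data = .SD))),
              by = variable]
print(coefs)
```

The result should have one row per response with a column per coefficient, much like the ddply output above.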

Other tips

Using your loop approach, I would simply store all the models in a list, like so:

results <- list()

for (i in c("val1", "val2", "val3")){
  frml <- paste(i, "~ a + b + c + d")
  frml <- as.formula(frml)

  results[[i]] <- glm.nb(frml, data = test)
}

You can then access each model in the list by name, e.g. results$val1.
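
With around 26 predictors, typing the right-hand side by hand gets tedious. One possible way to combine the loop with the setdiff line from the question is to let reformulate (from base R's stats package) build the formula from column names. This is only a sketch: dat is a hypothetical stand-in for test with the same column names, and the counts are simulated with rnbinom purely so the example fits cleanly.

```r
library(MASS)  # glm.nb

set.seed(1)
n <- 200
# Hypothetical stand-in for `test` (same column names, simulated counts)
dat <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n), d = rnorm(n))
dat$val1 <- rnbinom(n, mu = exp(1.0 + 0.3 * dat$a), size = 2)
dat$val2 <- rnbinom(n, mu = exp(1.5 + 0.2 * dat$b), size = 2)
dat$val3 <- rnbinom(n, mu = exp(2.0 - 0.2 * dat$c), size = 2)

responses  <- c("val1", "val2", "val3")
# Every column that is not a response is treated as a predictor
predictors <- setdiff(colnames(dat), responses)

results <- list()
for (resp in responses) {
  # Builds e.g. val1 ~ a + b + c + d without typing the terms out
  frml <- reformulate(predictors, response = resp)
  results[[resp]] <- glm.nb(frml, data = dat)
}

print(names(results))
print(coef(results$val1))
```

Adding more predictor columns to the data then needs no change to the modelling code, as long as every non-response column really is meant to be a predictor.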

And here is a solution with lapply:

summary.list <- lapply(test[, .SD, .SDcols = patterns('val')],
                       function(i) glm.nb(i ~ a + b + c + d, data = test))
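
A variation on the lapply idea, offered as a sketch rather than a replacement: iterating over the response names instead of the columns keeps the real variable name in each model's formula, and the coefficients can then be collected into one matrix. As before, dat is a hypothetical stand-in for test with simulated counts, and fits and coef_table are illustrative names.

```r
library(MASS)  # glm.nb

set.seed(1)
n <- 200
# Hypothetical stand-in for `test` (same column names, simulated counts)
dat <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n), d = rnorm(n))
dat$val1 <- rnbinom(n, mu = exp(1.0 + 0.3 * dat$a), size = 2)
dat$val2 <- rnbinom(n, mu = exp(1.5 + 0.2 * dat$b), size = 2)
dat$val3 <- rnbinom(n, mu = exp(2.0 - 0.2 * dat$c), size = 2)

responses  <- c("val1", "val2", "val3")
predictors <- setdiff(colnames(dat), responses)

# setNames() makes lapply return a list named val1, val2, val3
fits <- lapply(setNames(responses, responses), function(resp) {
  glm.nb(reformulate(predictors, response = resp), data = dat)
})

# One row per response, one column per coefficient
coef_table <- t(sapply(fits, coef))
print(coef_table)
```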
Licensed under: CC-BY-SA with attribution