Question

I am trying to fit a glm() in R using a variable that holds the column names instead of typing the column names directly, but it is not working. This would let me generate GLMs automatically. When I fit the glm using the column names, the program runs fine; when I replace the column names with the variable that contains them, the program gives an error.

Here's what my command looks like:

##The data
mydata <- structure(list(var1 = c(10L, 100L, 50L, 40L, 20L, 50L, 60L, 55L, 
45L), var2 = c(1.5, 1.2, 1, 1.4, 1.2, 1.4, 1.3, 1.4, 1.3), var3 = c(5L, 
3L, 4L, 1L, 5L, 2L, 7L, 5L, 4L), group = structure(c(1L, 1L, 
2L, 2L, 1L, 1L, 2L, 1L, 1L), .Label = c("A", "B"), class = "factor")), .Names = c("var1", 
"var2", "var3", "group"), class = "data.frame", row.names = c(NA, 
-9L))
## My variable
x <- c("var1+var2")
##Fitting the model
myglm <- glm(formula = group ~ var1+var2 , family = "binomial", data = mydata) ## works fine


myglm2 <- glm(formula = group ~ x , family = "binomial", data = mydata)
Error in model.frame.default(formula = group ~ x, data = mydata, drop.unused.levels = TRUE) : 
  variable lengths differ (found for 'x')

I tried the paste(x) and cat(x) functions, but neither worked. Is it possible to do this in R? I need this because I am fitting around 1000 GLMs in a for loop.

Solution

Edit: this is even easier with as.formula:

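valid.names <- names(mydata)[names(mydata) != "group"]  # all but group
models <- list()  # keep every fit instead of overwriting a single object
for(i in 2:length(valid.names)) {
  # build a string such as "group ~ var1 + var2", then convert it to a formula
  frm <- as.formula(paste("group ~", valid.names[i - 1], "+" , valid.names[i]))
  models[[i - 1]] <- glm(formula = frm, family = "binomial", data = mydata)
}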
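For the exact case in the question, the same idea can be applied to the string stored in x. In `group ~ x`, R looks for a variable called x rather than using its contents; it finds a character vector of length 1 while group has 9 values, hence "variable lengths differ". A minimal sketch, with the question's data rebuilt as a plain data.frame so the block runs on its own:

```r
# Same values as the question's mydata, written as a plain data.frame
mydata <- data.frame(
  var1  = c(10L, 100L, 50L, 40L, 20L, 50L, 60L, 55L, 45L),
  var2  = c(1.5, 1.2, 1, 1.4, 1.2, 1.4, 1.3, 1.4, 1.3),
  var3  = c(5L, 3L, 4L, 1L, 5L, 2L, 7L, 5L, 4L),
  group = factor(c("A", "A", "B", "B", "A", "A", "B", "A", "A"))
)

x <- "var1+var2"

# Paste the string into the formula text first, then convert it to a formula
frm <- as.formula(paste("group ~", x))
myglm2 <- glm(formula = frm, family = "binomial", data = mydata)
```

This should fit the same model as typing `group ~ var1+var2` directly.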

Old version

Here is a potential solution using parse:

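valid.names <- names(mydata[, -4])  # all but group (the 4th column)
frm <- group ~ x                    # template formula; x is only a placeholder
models <- list()                    # keep every fit instead of overwriting one
for(i in 2:length(valid.names)) {
  # parse "varA + varB" into an unevaluated call
  varplusvar <- parse(text=paste(valid.names[i - 1], "+" , valid.names[i]))[[1]]
  frm[[3]] <- varplusvar            # replace the right-hand side of the formula
  models[[i - 1]] <- glm(formula = frm, family = "binomial", data = mydata)
}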
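The `frm[[3]]` assignment works because a two-sided formula is a call with three elements: the `~` operator, the left-hand side, and the right-hand side. The sketch below inspects those parts and, as an alternative to parse, builds the right-hand side with call() and as.name(); the choice of var1 and var3 is only for illustration.

```r
frm <- group ~ x

op  <- frm[[1]]  # the `~` operator
lhs <- frm[[2]]  # left-hand side: group
rhs <- frm[[3]]  # right-hand side: the placeholder x

# Build var1 + var3 from column-name strings without parsing text
new_rhs <- call("+", as.name("var1"), as.name("var3"))
frm[[3]] <- new_rhs

print(frm)
```

Building the call from names avoids evaluating arbitrary text, which may matter if the column names come from outside the script.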

OTHER TIPS

The function reformulate is very helpful when you want to create a formula based on a string. You don't need paste:

x <- c("var1+var2")

form <- reformulate(x, response = "group")
# group ~ var1 + var2

glm(formula = form , family = "binomial", data = mydata)
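reformulate also accepts a character vector of term labels and joins them with "+", which may be more convenient when the column names are stored separately. A small sketch under that assumption, again rebuilding the question's data so it runs on its own; `vars`, `pairs`, and `forms` are names made up for this example:

```r
mydata <- data.frame(
  var1  = c(10L, 100L, 50L, 40L, 20L, 50L, 60L, 55L, 45L),
  var2  = c(1.5, 1.2, 1, 1.4, 1.2, 1.4, 1.3, 1.4, 1.3),
  var3  = c(5L, 3L, 4L, 1L, 5L, 2L, 7L, 5L, 4L),
  group = factor(c("A", "A", "B", "B", "A", "A", "B", "A", "A"))
)

# One formula from a vector of predictor names
vars <- c("var1", "var2")
form <- reformulate(vars, response = "group")
fit  <- glm(formula = form, family = "binomial", data = mydata)

# One formula per pair of predictors
pairs <- combn(setdiff(names(mydata), "group"), 2, simplify = FALSE)
forms <- lapply(pairs, reformulate, response = "group")
```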

as.formula makes it very easy. For this particular instance, using the data you created in your question:

mytarget <- "group"

myFormula <- as.formula(paste(mytarget,"~ var1 + var2"))

myglm <- glm(myFormula, family = "binomial", data = mydata)
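For the many-models case mentioned in the question, one option is to keep the right-hand sides in a character vector and collect the fits in a named list. This is only a sketch: `rhs_list` and `models` are hypothetical names, the three right-hand sides are examples, and the data is rebuilt from the question so the block is self-contained.

```r
mydata <- data.frame(
  var1  = c(10L, 100L, 50L, 40L, 20L, 50L, 60L, 55L, 45L),
  var2  = c(1.5, 1.2, 1, 1.4, 1.2, 1.4, 1.3, 1.4, 1.3),
  var3  = c(5L, 3L, 4L, 1L, 5L, 2L, 7L, 5L, 4L),
  group = factor(c("A", "A", "B", "B", "A", "A", "B", "A", "A"))
)

mytarget <- "group"
rhs_list <- c("var1 + var2", "var1 + var3", "var2 + var3")

# Fit one glm per right-hand side and keep all of them
models <- lapply(rhs_list, function(rhs) {
  frm <- as.formula(paste(mytarget, "~", rhs))
  glm(frm, family = "binomial", data = mydata)
})
names(models) <- rhs_list

# Example of pulling one statistic out of every fit
aic <- sapply(models, AIC)
```

With a data set this small, glm may emit convergence or separation warnings for some combinations; the fits are still returned.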
Licensed under: CC-BY-SA with attribution