Question

I'm trying to create a formula in R, of the form

Output~Var1+Var2+Var3

for use in a model. As far as I can tell, you give the name of the variable you want to predict, a tilde, and then the names of the variables you want to use as predictors; a later argument supplies the data frame containing the observations of those variables. The data frame I'm using, however, has quite a few variables in it, and I don't want to type them all out. The variable names also change fairly often, so keeping my code up to date would be a chore. In essence, I want to know how to write

Output~(All the variables that aren't the output)

I also need to exclude some other variables, though. Sorry to make it so clear that I don't know what's going on; ?formula didn't help much, and this isn't like any other programming or R structure I've seen before.

Thanks for any help,

N


Solution

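Ah, I found a much better solution. The function

reformulate(termlabels = colnames(InputTable), response = 'Prediction')

creates a formula from the strings you provide. Manipulate the colnames vector as you like to choose dynamically which variables are used in the model. If InputTable also contains the response column, remove it from termlabels first so that it is not used as a predictor as well.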
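As a minimal sketch of that idea: the data frame below, its column names, and the ID column to be excluded are all made up for illustration. setdiff drops the response and any unwanted columns from the name vector before it is handed to reformulate.

```r
# Hypothetical example data; the column names are assumptions for illustration.
InputTable <- data.frame(
  Prediction = c(1.2, 2.3, 3.1, 4.8, 5.2, 6.9),
  Var1 = c(1, 2, 3, 4, 5, 6),
  Var2 = c(5, 3, 8, 1, 9, 2),
  Var3 = c(0, 1, 0, 1, 0, 1),
  ID   = 1:6
)

response <- "Prediction"
excluded <- c("ID")  # other columns to keep out of the model

# Keep every column name except the response and the excluded ones.
predictors <- setdiff(colnames(InputTable), c(response, excluded))

# Build the formula from the remaining names.
fmla <- reformulate(termlabels = predictors, response = response)
print(fmla)

# The formula can then be passed to a modelling function as usual.
fit <- lm(fmla, data = InputTable)
```

Because the predictor list is computed from colnames, renaming or adding columns in the data frame should not require touching the formula code; only the response and excluded vectors need to stay in sync.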

OTHER TIPS

Actually, the ?formula documentation provides one possible answer. It is, however, extremely 'hacky', and one of the least pleasant ways I can imagine of accomplishing this:

## Create a formula for a model with a large number of variables:
xnam <- paste0("x", 1:25)
(fmla <- as.formula(paste("y ~ ", paste(xnam, collapse= "+"))))

i.e., you just paste together a string and convert it to a formula with as.formula.
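The same string-pasting approach can be driven by a data frame's column names instead of a generated x1..x25 sequence. This is only a sketch: the data frame df and its columns y, x1, x2 and skip are hypothetical.

```r
# Hypothetical data frame; all names here are assumptions for illustration.
df <- data.frame(
  y    = c(3, 5, 4, 8, 7),
  x1   = c(1, 2, 3, 4, 5),
  x2   = c(2, 7, 1, 8, 3),
  skip = letters[1:5]
)

# Predictor names: every column except the response and the one to exclude.
xnam <- setdiff(names(df), c("y", "skip"))

# Paste the pieces into a single string, then convert it to a formula.
fmla_string <- paste("y ~", paste(xnam, collapse = " + "))
fmla <- as.formula(fmla_string)
print(fmla)
```

Column names that are not syntactically valid R names (for example, names containing spaces) would need to be wrapped in backticks before pasting, since the string is parsed as R code.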

Licensed under: CC-BY-SA with attribution