Question

I'm trying to do a simple linear regression on my data frame that looks something like what follows. The actual data set has more factors and more predictors (x's) all trying to predict y.

f1 f2 x y
x  a  1 3.3
x  a  2 3.2
x  a  3 3.04
x  b  1 4.5
x  b  2 4.9
x  b  3 8
y  a  1 20.1
y  a  2 20.3
y  a  3 21.9
y  b  1 101.2
y  b  2 201.8
y  b  3 332.8

Notice that the trend varies for every combination of f1 & f2. What I want to do is build an lm model for each combination of f1 & f2, store the models in some kind of list, and then, when I call predict, use the appropriate model to predict y based on x. I think I should use ldply to create a list of models, as shown below:

lm.model.list = ldply(x,.(f1,f2),function(x) {
 fit = lm(x$y ~ x$x)
 return(fit)
 }

This gives an error,

Error: attempt to apply non-function

Also, assume I get it all into a list, how do I work with predict after that?

edit: I realize I could use indicator variables for the factors in the modelling itself, but I want to avoid this.


Solution

I think what you want is just:

fit <- lm(y ~ x + f1 * f2, data = dfrm)

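This fits a single model with a separate intercept for each combination of f1 and f2 and one shared slope for x, so the predictions differ for each level of the interaction of f1 with f2. If the slope of x should also vary by combination, use y ~ x * f1 * f2 instead. It's just one model, but it can be "queried" with the predict function for any desired combo of f1 and f2. You should look at ?formula and spend some time understanding how linear models get interpreted.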
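To make that concrete, here is a minimal sketch using the twelve rows from the question. The data frame name dfrm comes from the formula above; newdata, fit_full, groups and models are names made up for this illustration. It fits the single interaction model, queries it with predict, and then shows the list-of-models route the question asked about, built with base R's split and lapply rather than plyr (that substitution is mine, not something the question or answer confirms).

```r
# Example data copied from the question.
dfrm <- data.frame(
  f1 = factor(rep(c("x", "y"), each = 6)),
  f2 = factor(rep(rep(c("a", "b"), each = 3), times = 2)),
  x  = rep(1:3, times = 4),
  y  = c(3.3, 3.2, 3.04, 4.5, 4.9, 8,
         20.1, 20.3, 21.9, 101.2, 201.8, 332.8)
)

# One model: a separate intercept per f1/f2 combination, one shared slope for x.
fit <- lm(y ~ x + f1 * f2, data = dfrm)

# Query the model for any combination of the factors (hypothetical new row).
newdata <- data.frame(f1 = "y", f2 = "b", x = 2)
predict(fit, newdata = newdata)

# Variant: let the slope of x vary by combination as well.
fit_full <- lm(y ~ x * f1 * f2, data = dfrm)
predict(fit_full, newdata = newdata)

# Alternative closer to the question: one lm per combination, kept in a
# named list. split() names the pieces "x.a", "y.a", "x.b", "y.b".
groups <- split(dfrm, list(dfrm$f1, dfrm$f2), drop = TRUE)
models <- lapply(groups, function(d) lm(y ~ x, data = d))

# Pick the model by name and predict from x alone.
predict(models[["y.b"]], newdata = data.frame(x = 2))
```

The fully interacted model and the per-group list give the same fitted lines, since each combination ends up with its own intercept and slope either way. The list just avoids putting the factors in the formula, at the cost of not pooling the residual variance across groups.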

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow