Problem

I use R only a little bit and never use data frames, which makes understanding the correct use of predict difficult. I have my data in plain matrices, not data frames, call them a and b, which are N x p and M x p matrices respectively. I can run the regression lm(a[,1] ~ a[,-1]). I would like to use the resulting lm object to predict b[,1] from b[,-1]. My naive guess of predict(lm(a[,1] ~ a[,-1]), b[,-1]) doesn't work. What's the right syntax to use the lm to get a vector of predictions?


Solution

You can store a whole matrix in one column of a data.frame:

x <- a [, -1]
y <- a [,  1]
data <- data.frame (y = y, x = I (x))
str (data)
## 'data.frame':    10 obs. of  2 variables:
## $ y: num  0.818 0.767 -0.666 0.788 -0.489 ...
## $ x: AsIs [1:10, 1:9] 0.916274.... 0.386565.... 0.703230.... -2.64091.... 0.274617.... ...

model <- lm (y ~ x)
newdata <- data.frame (x = I (b [, -1]))
predict (model, newdata) 
##         1         2 
## -3.795722 -4.778784 

This technique is explained in the paper about the pls package (Mevik, B.-H. and Wehrens, R.: The pls Package: Principal Component and Partial Least Squares Regression in R. Journal of Statistical Software, 2007, 18, 1-24).

Another example, with a spectroscopic data set (quinine fluorescence), is in vignette("flu") of my package hyperSpec.
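For reference, here is a minimal self-contained version of the trick above; the matrix sizes and the seed are made up for illustration:

```r
# Minimal sketch of the I() technique: keep the predictor matrix as a
# single AsIs column so lm() and predict() see matching structures.
set.seed(1)
a <- matrix(rnorm(10 * 6), 10, 6)   # training data, N = 10, p = 6
b <- matrix(rnorm( 3 * 6),  3, 6)   # new data,      M = 3

train <- data.frame(y = a[, 1], x = I(a[, -1]))
model <- lm(y ~ x, data = train)

newdata <- data.frame(x = I(b[, -1]))
pred <- predict(model, newdata)     # length-M vector of predictions
length(pred)                        # 3
```

Because `x` is stored as one AsIs matrix column, the column name matches between the training and prediction data frames, which is exactly what predict() needs.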

Other tips

To make data.frames out of your matrices, simply do:

m = matrix(runif(100), 10, 10)
df = as.data.frame(m)

And perform linear regression:

lm_result = lm(V1 ~ V2, df)
predicted_values = predict(lm_result, b)

Or for multiple regression:

lm_result = lm(V1 ~ V2 + V3 + V4, df)
predicted_values = predict(lm_result, b)

assuming b is a data.frame (e.g. converted with as.data.frame) in which the columns V1 - V4 are present.
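Note that predict() matches the columns of newdata by name, so b has to be converted the same way, ending up with the same V1, V2, ... names that as.data.frame() assigns. A hedged sketch with made-up data:

```r
# predict() looks up newdata columns by name, so convert b with
# as.data.frame() too, which assigns the same default V1 ... V10 names.
set.seed(2)
df   <- as.data.frame(matrix(runif(100), 10, 10))   # columns V1 ... V10
b_df <- as.data.frame(matrix(runif( 50),  5, 10))   # same column names

lm_result        <- lm(V1 ~ V2 + V3 + V4, data = df)
predicted_values <- predict(lm_result, b_df)
length(predicted_values)                            # 5
```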

You could compute the predictions manually:

> fit <- lm(a[,1] ~ a[,-1])
> fit$coefficients[1] + b[,-1] %*% fit$coefficients[-1]
     [,1]
[1,]    1
[2,]    2
[3,]    5

Here, fit$coefficients[1] is the intercept, and fit$coefficients[-1] are the remaining coefficients (and %*% is matrix multiplication).
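As a sanity check, the manual computation agrees with predict() used via the data.frame-with-matrix-column trick from the accepted answer (simulated data, sizes made up for illustration):

```r
# Manual prediction vs. predict(): both should give the same numbers,
# since the two fits use the same design matrix.
set.seed(3)
a <- matrix(rnorm(20 * 4), 20, 4)
b <- matrix(rnorm( 5 * 4),  5, 4)

fit    <- lm(a[, 1] ~ a[, -1])
manual <- fit$coefficients[1] + b[, -1] %*% fit$coefficients[-1]

# cross-check using the I() approach
train <- data.frame(y = a[, 1], x = I(a[, -1]))
check <- predict(lm(y ~ x, data = train), data.frame(x = I(b[, -1])))
all.equal(as.vector(manual), as.vector(check))      # TRUE
```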

I'm using lm inside a function to cycle through many different linear models and then perform leave-one-out cross validation for predictions. @PaulHiemstra's sprintf suggestion did the trick.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow