Question

I am working with a data frame:

x
    Date      Val
    1/1/2012   7
    2/1/2012   9
    3/1/2012   20
    4/1/2012   24
    5/1/2012   50
a <- seq(as.Date(tail(x, 1)$Date), by="month", length=5)
a <- data.frame(a)
x.lm <- lm(x$Val ~ x$Date)

x.pre<-predict(x.lm, newdata=a)

I am getting this warning:

Warning message:
'newdata' had 5 rows but variable(s) found have 29 rows 

What am I doing wrong?

Here is the dput output:

dput(x)
structure(list(Date = structure(c(14610, 14641, 14669, 14700, 
14730, 14761, 14791, 14822, 14853, 14883, 14914, 14944, 14975, 
15006, 15034, 15065, 15095, 15126, 15156, 15187, 15218, 15248, 
15279, 15309, 15340, 15371, 15400, 15431, 15461), class = "Date"), 
    Val = c(45, 51, 56, 56, 59, 60, 60, 60, 64, 65, 75, 73, 74, 
    80, 87, 91, 92, 96, 109, 108, 123, 129, 133, 143, 127, 127, 
    123, 121, 130)), .Names = c("Date", "Val"), row.names = c(NA, 
29L), class = "data.frame")

Solution

The formula stored in the x.lm model contains the terms x$Val and x$Date, which refer explicitly to the x data frame. predict() finds no variables with those names in a, so it evaluates them against the 29 rows of x again and ignores newdata, which is probably not what you wanted, hence the warning. Fit the model with unqualified variable names and give newdata a column with the same name, Date:

a <- seq(as.Date(tail(x, 1)$Date), by="month", length=5)
a <- data.frame(Date = a)
x.lm <- lm(Val ~ Date, data=x)
x.pre<-predict(x.lm, newdata=a)
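To see the difference side by side, here is a minimal sketch using the data from the question's dput output. The names new_dates, bad.lm, bad.pre, good.lm and good.pre are made up for this illustration and do not appear in the original code.

```r
# Data taken from the dput() output in the question
x <- data.frame(
  Date = structure(c(14610, 14641, 14669, 14700, 14730, 14761, 14791,
                     14822, 14853, 14883, 14914, 14944, 14975, 15006,
                     15034, 15065, 15095, 15126, 15156, 15187, 15218,
                     15248, 15279, 15309, 15340, 15371, 15400, 15431,
                     15461), class = "Date"),
  Val = c(45, 51, 56, 56, 59, 60, 60, 60, 64, 65, 75, 73, 74, 80, 87,
          91, 92, 96, 109, 108, 123, 129, 133, 143, 127, 127, 123, 121,
          130)
)

# Five monthly dates starting at the last observed date,
# stored in a column named Date so it matches the model term
new_dates <- data.frame(
  Date = seq(tail(x$Date, 1), by = "month", length.out = 5)
)

# Qualified names: the terms are x$Val and x$Date, so newdata is ignored
# (the warning from the question is suppressed here on purpose)
bad.lm  <- lm(x$Val ~ x$Date)
bad.pre <- suppressWarnings(predict(bad.lm, newdata = new_dates))
length(bad.pre)   # one prediction per row of x

# Unqualified names: predict() looks up Date in newdata
good.lm  <- lm(Val ~ Date, data = x)
good.pre <- predict(good.lm, newdata = new_dates)
length(good.pre)  # one prediction per row of new_dates
```

Note that the sequence starts at the last date already in x, just as in the question's code, so the first prediction is for an observed month.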

OTHER TIPS

Your data.frame a has a column named a. The model was built from variables named Val and Date, so that is what it's looking for.

When you make your data.frame a, name that column Date and you're good to go:

a <- data.frame(Date=a)

Then it runs without the warning.

Per comment:

Edit your lm call to be:

lm(Val ~ Date, data=x)

If you can't make predict.lm() work, you can write your own prediction function using function(), where intercept, b1, b2, ... are the fitted coefficients (available from coef() on the model):

yourown_function <- function(predictor1, predictor2, ...) { intercept + b1*predictor1 + b2*predictor2 + ... }

Use yourown_function to predict from any new data frame (newdf below is a placeholder name):

newvalues <- yourown_function(predictor1 = newdf$predictor1, predictor2 = newdf$predictor2, ...)

Using the new values, you can compute residuals, MSE, etc.
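For the single-predictor model in the question, the hand-written approach might look like the sketch below. It assumes the model is fitted as lm(Val ~ Date, data = x), so the coefficients are named "(Intercept)" and "Date". The names fit, b, predict_manual, new_dates and manual are hypothetical and chosen only for this example.

```r
# Data taken from the dput() output in the question
x <- data.frame(
  Date = structure(c(14610, 14641, 14669, 14700, 14730, 14761, 14791,
                     14822, 14853, 14883, 14914, 14944, 14975, 15006,
                     15034, 15065, 15095, 15126, 15156, 15187, 15218,
                     15248, 15279, 15309, 15340, 15371, 15400, 15431,
                     15461), class = "Date"),
  Val = c(45, 51, 56, 56, 59, 60, 60, 60, 64, 65, 75, 73, 74, 80, 87,
          91, 92, 96, 109, 108, 123, 129, 133, 143, 127, 127, 123, 121,
          130)
)

fit <- lm(Val ~ Date, data = x)
b   <- coef(fit)  # named vector: "(Intercept)" and "Date"

# Hand-written prediction: intercept + slope * date (as days since 1970-01-01)
predict_manual <- function(dates) {
  b[["(Intercept)"]] + b[["Date"]] * as.numeric(dates)
}

new_dates <- data.frame(
  Date = seq(tail(x$Date, 1), by = "month", length.out = 5)
)
manual <- predict_manual(new_dates$Date)

# Should agree with predict() up to numerical tolerance
all.equal(manual, unname(predict(fit, newdata = new_dates)))
```

Comparing against predict() like this is a quick sanity check that the hand-written formula uses the coefficients in the right order.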

Instead of x.lm <- lm(x$Val ~ x$Date, data = x), use x.lm <- lm(Val ~ Date, data = x). Removing the data frame name in front of each variable name in the lm call should help.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow