Question

Using dlply (from this post; code below) I am able to generate a list of linear models on subsets of my data.frame. Now that I have this list, I would like to use the models to generate values in another data.frame.

The list contains a model for each DAY and variable subset. I would like to apply each model to the same subset in another data.frame. For example, for DAY == 5 and variable == Var.1 the model (y = mx + b) is value = -4.521869 * Location + 21.315. Using the model for the appropriate subset, I would calculate values for Var.1 in another data.frame (e.g. dat_rec, which already has entries for DAY and Location).

Is there a way to use the models from the list on the same subsets in another data.frame (e.g. use the model for DAY == 5 and variable == Var.1 to populate values everywhere, such as at different Sites, where DAY == 5 and variable == Var.1)? Is there a similar list method to populate a data.frame with the values calculated using the models from the list? The desired end product (i.e. dat_rec below) is a data.frame.
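As a quick check of the example coefficients, the sketch below refits that single model with base R. It assumes the four Var.1 rows of dat with DAY == 5, which is the subset that reproduces the quoted slope and intercept; the names day5_var1, fit, and new_locations are made up for illustration.

```r
# Var.1 values from dat for DAY == 5 (two sites at Location 0, two at Location 1)
day5_var1 <- data.frame(
  Location = c(0, 0, 1, 1),
  value    = c(20.9, 21.73, 21.73, 11.85626285)
)

# Same formula as in the dlply call
fit <- lm(value ~ Location, data = day5_var1)
coef(fit)  # intercept is about 21.315, slope about -4.521869

# Apply the model to new Location values, as wanted for dat_rec
new_locations <- data.frame(Location = c(0.1, 0.2, 0.3, 0.4))
predict(fit, newdata = new_locations)
```

The predictions agree with the Var.1 values filled in for DAY 5 in dat_rec.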

# Data
dat <- structure(list(Site = c(32L, 32L, 32L, 32L, 10L, 10L, 10L, 10L, 
32L, 32L, 32L, 32L, 10L, 10L, 10L, 10L), Location = c(0L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), DAY = c(5L, 
55L, 555L, 5555L, 5L, 55L, 555L, 5555L, 5L, 55L, 555L, 5555L, 
5L, 55L, 555L, 5555L), Var.1 = c(20.9, 20.8, 21.03, 21.36, 21.73, 
21.18, 20.73, 21.98, 21.73, 12.48702448, 12.19642662, 12.33218874, 
11.85626285, 11.88812108, 12.70549981, 11.89587521), Var.2 = c(100L, 
100L, 100L, 100L, 100L, 100L, 100L, 100L, 90L, 90L, 90L, 91L, 
92L, 88L, 89L, 90L), Var.3 = c(14.47, 14.4, 14.3, 14.14, 14.72, 
14.62, 14.14, 14.49, 10.27287765, 10.27287765, 10.41763527, 10.51725376, 
11.12918753, 10.81166867, 10.80656509, 11.00093898), Var.4 = c(890.19, 
888.9, 889.14, 888.15, 889.57, 888.41, 887.48, 886.87, 688.15, 
698.23, 650.99, 700.01, 699, 689.6, 658.7, 689.99)), .Names = c("Site", 
"Location", "DAY", "Var.1", "Var.2", "Var.3", "Var.4"), class = "data.frame", row.names = c(NA, 
-16L))

# melt data for use with dlply
# (assuming melt() comes from reshape2 and dlply() from plyr)
library(reshape2)
library(plyr)
mdat <- melt(dat, id=c("DAY", "Site", "Location"))

# this dlply solution was built from here https://stackoverflow.com/a/1214432/1670053
models_mdat <- dlply(mdat, c("DAY","variable"), function(df) 
                lm(value ~ Location, data = df))

# example (partial) result, with Var.1 filled in for two DAYs
# I've only filled in the values for Var.1 using the model from the list 
# for DAY 5 and 55.
# not melted
dat_rec <- structure(list(Site = c(1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L), Location = c(0.1, 
0.2, 0.3, 0.4, 0.1, 0.2, 0.3, 0.4), DAY = c(5L, 5L, 5L, 5L, 55L, 
55L, 55L, 55L), Var.1 = c(20.8628131, 20.4106262, 19.9584393, 
19.5062524, 20.1097573, 19.2295146, 18.3492719, 17.4690292), 
    Var.2 = c(NA, NA, NA, NA, NA, NA, NA, NA), Var.3 = c(NA, 
    NA, NA, NA, NA, NA, NA, NA), Var.4 = c(NA, NA, NA, NA, NA, 
    NA, NA, NA)), .Names = c("Site", "Location", "DAY", "Var.1", 
"Var.2", "Var.3", "Var.4"), class = "data.frame", row.names = c(NA, 
-8L))
# melted
dat_rec_melt <- structure(list(Site = c(1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 1L, 1L, 
1L, 1L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 1L, 1L, 
1L, 1L, 3L, 3L, 3L, 3L), Location = c(0.1, 0.2, 0.3, 0.4, 0.1, 
0.2, 0.3, 0.4, 0.1, 0.2, 0.3, 0.4, 0.1, 0.2, 0.3, 0.4, 0.1, 0.2, 
0.3, 0.4, 0.1, 0.2, 0.3, 0.4, 0.1, 0.2, 0.3, 0.4, 0.1, 0.2, 0.3, 
0.4), DAY = c(5L, 5L, 5L, 5L, 55L, 55L, 55L, 55L, 5L, 5L, 5L, 
5L, 55L, 55L, 55L, 55L, 5L, 5L, 5L, 5L, 55L, 55L, 55L, 55L, 5L, 
5L, 5L, 5L, 55L, 55L, 55L, 55L), variable = structure(c(1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("Var.1", 
"Var.2", "Var.3", "Var.4"), class = "factor"), value = c(20.8628131, 
20.4106262, 19.9584393, 19.5062524, 20.1097573, 19.2295146, 18.3492719, 
17.4690292, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("Site", 
"Location", "DAY", "variable", "value"), row.names = c(NA, -32L
), class = "data.frame")

Solution

I think you are looking for predict:

sapply(models_mdat, predict, newdata = dat_rec)

EDIT: to get the results aligned with the new data:

lapply(models_mdat, function(x)
       cbind(dat_rec, fit = predict(x, newdata = dat_rec)))
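Each model was fit on one DAY/variable subset, so predicting every model on all of dat_rec also yields values for rows from other days. A minimal sketch of picking only the matching model per row follows. It uses base R only, with a two-model stand-in for models_mdat (here called models) and a data.frame new_dat shaped like dat_rec; both names are illustrative. It also assumes the list names follow a "DAY.variable" pattern such as "5.Var.1", which should be checked with names(models_mdat).

```r
# Training subsets taken from dat (Var.1 only, DAY 5 and 55)
train <- data.frame(
  DAY      = c(5, 5, 5, 5, 55, 55, 55, 55),
  Location = c(0, 0, 1, 1, 0, 0, 1, 1),
  value    = c(20.9, 21.73, 21.73, 11.85626285,
               20.8, 21.18, 12.48702448, 11.88812108)
)

# Stand-in for models_mdat: one lm per DAY, named "<DAY>.Var.1" (assumed naming)
models <- lapply(split(train, train$DAY),
                 function(df) lm(value ~ Location, data = df))
names(models) <- paste(names(models), "Var.1", sep = ".")

# Rows to fill in, shaped like the first columns of dat_rec
new_dat <- data.frame(
  Site     = c(1, 1, 1, 1, 3, 3, 3, 3),
  Location = c(0.1, 0.2, 0.3, 0.4, 0.1, 0.2, 0.3, 0.4),
  DAY      = c(5, 5, 5, 5, 55, 55, 55, 55)
)

# For each row, predict with the model whose name matches that row's DAY
keys <- paste(new_dat$DAY, "Var.1", sep = ".")
new_dat$Var.1 <- vapply(seq_len(nrow(new_dat)), function(i) {
  unname(predict(models[[keys[i]]], newdata = new_dat[i, , drop = FALSE]))
}, numeric(1))

new_dat
```

Looping row by row is slow for large data; splitting the new data by the same key and predicting once per group is usually preferable.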

OTHER TIPS

Based on the information from agstudy, predict is the tool I was looking for to calculate the values from the models. Knowing that I wanted to use the model list generated with dlply to update a data.frame with predictions, I had a much better idea of what to search for.

I found a solution in this post. To achieve the result I was looking for, I needed to use the model list and also the data as a list. predict can then be used with mdply to arrive at an updated data.frame.

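dat_rec_list <- dlply(dat_rec_melt, c("DAY", "variable"))

# models_mdat holds a model for every DAY in dat, while dat_rec_list only covers
# the DAYs present in dat_rec_melt; select the models by name so that cbind()
# does not recycle the shorter list against the wrong models
models_rec <- models_mdat[names(dat_rec_list)]

predictions <- mdply(cbind(mod = models_rec, df = dat_rec_list), function(mod, df) {
  mutate(df, pred = predict(mod, newdata = df))
})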
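For comparison, a plyr-free sketch of the same pairing is shown below. It uses only base R (split, Map, do.call) on a small melted subset of dat. The helper key() and the names train, new_long, and pred_list are made up for illustration. Models and data subsets are matched by name, so a model with no counterpart in the new data (here DAY 55) is simply ignored.

```r
# Melted training data taken from dat: DAY 5 (Var.1, Var.2) and DAY 55 (Var.1)
train <- data.frame(
  DAY      = rep(c(5, 5, 55), each = 4),
  variable = rep(c("Var.1", "Var.2", "Var.1"), each = 4),
  Location = rep(c(0, 0, 1, 1), times = 3),
  value    = c(20.9, 21.73, 21.73, 11.85626285,         # DAY 5,  Var.1
               100, 100, 90, 92,                        # DAY 5,  Var.2
               20.8, 21.18, 12.48702448, 11.88812108)   # DAY 55, Var.1
)

# Hypothetical helper: one "DAY.variable" key per row
key <- function(df) paste(df$DAY, df$variable, sep = ".")

# One model per DAY/variable subset
models <- lapply(split(train, key(train)),
                 function(df) lm(value ~ Location, data = df))

# New melted data covering DAY 5 only
new_long <- data.frame(
  DAY      = 5,
  variable = rep(c("Var.1", "Var.2"), each = 2),
  Location = c(0.1, 0.2, 0.1, 0.2)
)
new_list <- split(new_long, key(new_long))

# Pair each subset with the model of the same name, then add the predictions
common <- intersect(names(new_list), names(models))
pred_list <- Map(function(mod, df) {
  df$pred <- unname(predict(mod, newdata = df))
  df
}, models[common], new_list[common])

# Stack the per-subset results back into a single data.frame
predictions <- do.call(rbind, pred_list)
rownames(predictions) <- NULL
predictions
```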
Licensed under: CC-BY-SA with attribution