Question

I have a data frame with two columns:

id  score  
1     0.5  
1     0.7  
1     0.8  
2     0.7  
2     0.8  
2     0.9  

I want to generate a new column ("new") by iterating over the rows of "score" and applying one of two functions ("function1" or "function2"), depending on whether "id" differs from or matches the id of the previous row. This part I can do. My problem is that I want function2 to refer to the value of "new" generated for the previous row (by function1 on the first row of each id). Something like:

function1 <- function(score) score * 10          # first row of each id
function2 <- function(score, prev) score * prev  # prev = "new" from the previous row

id  score  new    
1     0.5  5   
1     0.7  3.5  
1     0.8  2.8  
2     0.7  7  
2     0.8  5.6  
2     0.9  5.04  

I know that apply() can't do this kind of backwards reference, but I can't for the life of me figure out how to do it with a loop. Any suggestions would be amazing as I am pulling my hair out at this point!

Solution

For the specific example in the question:

DT <- read.table(text="id  score  
1     0.5  
1     0.7  
1     0.8  
2     0.7  
2     0.8  
2     0.9  ", header=TRUE)

library(data.table)
setDT(DT)

DT[, new := 10*cumprod(score), by=id]
#   id score  new
#1:  1   0.5 5.00
#2:  1   0.7 3.50
#3:  1   0.8 2.80
#4:  2   0.7 7.00
#5:  2   0.8 5.60
#6:  2   0.9 5.04

In the more general case, where the recurrence is not a plain running product, you would need Reduce(..., accumulate = TRUE) where I have used cumprod.
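A minimal sketch of that general case, assuming the two step functions from the question are rewritten so that function2 takes the previous result as an explicit second argument (the names running_new and prev are made up for illustration). It uses only base R: Reduce with accumulate = TRUE carries the previous value forward, and ave applies it within each id.

```r
df <- data.frame(id = rep(c(1, 2), each = 3),
                 score = c(.5, .7, .8, .7, .8, .9))

# Hypothetical step functions, adapted from the question.
function1 <- function(score) score * 10          # first row of each id
function2 <- function(score, prev) score * prev  # later rows reuse the previous result

# Run the recurrence over the scores of a single id.
running_new <- function(score) {
  if (length(score) == 0) return(numeric(0))
  Reduce(function(prev, s) function2(s, prev),
         score[-1],
         init = function1(score[1]),
         accumulate = TRUE)
}

# ave() calls running_new once per id and puts the results back in row order.
df$new <- ave(df$score, df$id, FUN = running_new)
```

With the data.table from above, the same helper should slot in as DT[, new := running_new(score), by=id]. For this particular pair of functions the result matches 10*cumprod(score), which remains the simpler and faster choice.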

OTHER TIPS

This can be done fairly simply with the mutate() function in the dplyr package:

library(dplyr)
df <- data.frame(id=rep(c(1,2),each=3), score=c(.5,.7,.8,.7,.8,.9))
mutate(group_by(df, id), new = 10*cumprod(score))

#Source: local data frame [6 x 3]
#Groups: id

#  id score  new
#1  1   0.5 5.00
#2  1   0.7 3.50
#3  1   0.8 2.80
#4  2   0.7 7.00
#5  2   0.8 5.60
#6  2   0.9 5.04
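
Since the question asks specifically about a loop, here is a rough base R sketch of the explicit version, with the bodies of function1 and function2 inlined. It is only meant to show how the backwards reference works. The vectorised versions above are preferable for real data.

```r
df <- data.frame(id = rep(c(1, 2), each = 3),
                 score = c(.5, .7, .8, .7, .8, .9))

df$new <- NA_real_
for (i in seq_len(nrow(df))) {
  if (i == 1 || df$id[i] != df$id[i - 1]) {
    # id changed (or first row): start from the score itself (function1)
    df$new[i] <- df$score[i] * 10
  } else {
    # same id as the previous row: build on the previous "new" (function2)
    df$new[i] <- df$score[i] * df$new[i - 1]
  }
}
```

The key point is that the loop reads df$new[i - 1], which was already filled in on the previous iteration. This assumes the rows are sorted so that each id forms one contiguous block.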
Licensed under: CC-BY-SA with attribution