I have a data frame with two columns:

id  score  
1     0.5  
1     0.7  
1     0.8  
2     0.7  
2     0.8  
2     0.9  

I want to generate a new column ("new") by iterating over the rows of "score" and applying one of two functions ("function1" or "function2"), depending on whether "id" differs from or matches the id in the previous row. This part I can do. My problem is that I want function2 to refer to the value of "new" generated for the previous row (starting from the value generated by function1). Something like:

function1 <- function(score) { new <- score * 10; return(new) }
# prev_new stands for the "new" value from the previous row
function2 <- function(score, prev_new) { new <- score * prev_new; return(new) }

id  score  new    
1     0.5  5   
1     0.7  3.5  
1     0.8  2.8  
2     0.7  7  
2     0.8  5.6  
2     0.9  5.04  

I know that apply() can't do this kind of backwards reference, but I can't for the life of me figure out how to do it with a loop. Any suggestions would be amazing as I am pulling my hair out at this point!

Solution

For the specific example in the question:

DT <- read.table(text="id  score  
1     0.5  
1     0.7  
1     0.8  
2     0.7  
2     0.8  
2     0.9  ", header=TRUE)

library(data.table)
setDT(DT)

DT[, new := 10*cumprod(score), by=id]
#   id score  new
#1:  1   0.5 5.00
#2:  1   0.7 3.50
#3:  1   0.8 2.80
#4:  2   0.7 7.00
#5:  2   0.8 5.60
#6:  2   0.9 5.04

In the more general case, where the update is not a plain running product, you'd need Reduce (with accumulate = TRUE) where I have used cumprod.
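
A minimal sketch of that general case, assuming function2 is rewritten to take the previous row's "new" as a second argument (the names prev_new and running_new are hypothetical, not from the question). It uses base R's ave() for the grouping so that it runs without extra packages:

```r
# Hypothetical stand-ins for the question's functions (assumed signatures)
function1 <- function(score) score * 10                   # first row of each id
function2 <- function(score, prev_new) score * prev_new   # later rows use the previous "new"

df <- data.frame(id = rep(c(1, 2), each = 3),
                 score = c(0.5, 0.7, 0.8, 0.7, 0.8, 0.9))

# Fold over one group's scores, keeping every intermediate value.
# Reduce() calls f(accumulated, next_element), so prev_new comes first.
running_new <- function(score) {
  Reduce(function(prev_new, s) function2(s, prev_new),
         score[-1], accumulate = TRUE, init = function1(score[1]))
}

# Apply the fold separately within each id
df$new <- ave(df$score, df$id, FUN = running_new)
print(df)
```

With data.table, the same helper should drop into the call above as DT[, new := running_new(score), by=id].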

Other tips

df <- data.frame(id=rep(c(1,2),each=3), score=c(.5,.7,.8,.7,.8,.9))

This could be done relatively simply with the mutate() function in the dplyr package:

library(dplyr)
# group by id, then take the running product of score within each group
mutate(group_by(df, id), new = 10*cumprod(score))

#Source: local data frame [6 x 3]
#Groups: id

#  id score  new
#1  1   0.5 5.00
#2  1   0.7 3.50
#3  1   0.8 2.80
#4  2   0.7 7.00
#5  2   0.8 5.60
#6  2   0.9 5.04
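
Since the question asks specifically about a loop, here is a rough base R sketch of one. It assumes the rows are already ordered so that equal ids are adjacent, and that function2 takes the previous row's "new" as a second argument (prev_new is a hypothetical name, not from the question):

```r
# Hypothetical stand-ins for the question's functions (assumed signatures)
function1 <- function(score) score * 10
function2 <- function(score, prev_new) score * prev_new

df <- data.frame(id = rep(c(1, 2), each = 3),
                 score = c(0.5, 0.7, 0.8, 0.7, 0.8, 0.9))

df$new <- NA_real_   # preallocate the result column
for (i in seq_len(nrow(df))) {
  if (i == 1 || df$id[i] != df$id[i - 1]) {
    # first row overall, or id changed: start a new chain
    df$new[i] <- function1(df$score[i])
  } else {
    # same id as the previous row: build on the previous "new"
    df$new[i] <- function2(df$score[i], df$new[i - 1])
  }
}
print(df)
```

This is slower than cumprod() on large data, but it works for any function2 that depends only on the current score and the previous "new".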
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow