Adding time (seconds) to select cells in a large data set conditioned on values in another vector

StackOverflow https://stackoverflow.com/questions/19407983

  •  30-06-2022

Question

There is probably a basic fix for this, but being new to R, I have been unsuccessful in finding it.

I have two variables, V1 (POSIXct) and V2 (numeric). I would like to add (10 - V2) seconds to V1 wherever V2 != 0.

df <- data.frame(V1 = c(970068340, 970068350, 970068366, 970068376, 970068380,
                        970068394),
                 V2 = c(0, 0, 6, 6, 0, 4))

I've attempted the following loop, but with more than 2 million observations, it takes much too long to execute. Is there an efficient solution to this problem?

for (i in seq_along(df$V2)) {
  if (df$V2[i] != 0) {
    df$V1[i] <- df$V1[i] + (10 - df$V2[i])
  }
}

For clarification, the data look like this:

     V1      V2
  970068340  0
  970068350  0
  970068366  6
  970068376  6
  970068380  0
  970068394  4

and I would like to transform it to the following:

     V1      V2
  970068340  0
  970068350  0
  970068370  6
  970068380  6
  970068380  0
  970068400  4

Solution

I'd use [ to subset and [<- to replace. Both are fully vectorised, so no loop is needed (even though the result looks a little untidy). Short of using data.table, I reckon this is the fastest way in base R:

rows <- df$V2 != 0
df[ rows , "V1" ] <- df[ rows , "V1" ] + 10 - df[ rows , "V2" ]
#         V1 V2
#1 970068340  0
#2 970068350  0
#3 970068370  6
#4 970068380  6
#5 970068380  0
#6 970068400  4
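
The question describes V1 as POSIXct, while the example data stores it as plain numbers. The following is a minimal sketch of the same logical-index replacement on a real POSIXct column. The origin and the UTC time zone are assumptions made for illustration and are not stated in the question.

```r
# Example data from the question, with V1 converted to POSIXct.
# Assumption: values are seconds since 1970-01-01 in UTC.
df <- data.frame(
  V1 = as.POSIXct(c(970068340, 970068350, 970068366,
                    970068376, 970068380, 970068394),
                  origin = "1970-01-01", tz = "UTC"),
  V2 = c(0, 0, 6, 6, 0, 4)
)

# Rows that need adjusting
rows <- df$V2 != 0

# Adding a number to a POSIXct value adds that many seconds
df$V1[rows] <- df$V1[rows] + (10 - df$V2[rows])

# Inspect the result as seconds since the epoch
as.numeric(df$V1)
```

The column keeps its POSIXct class after the replacement, so later date-time operations on V1 should still behave as expected.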

OTHER TIPS

Another option is:

transform(df, V1 = V1 + (10 - V2) * as.logical(V2))
#         V1 V2
#1 970068340  0
#2 970068350  0
#3 970068370  6
#4 970068380  6
#5 970068380  0
#6 970068400  4
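
The transform() call above works because as.logical(V2) is FALSE for zeros and TRUE otherwise, and logical values act as 0 and 1 in arithmetic. A small sketch of the intermediate values may make this clearer. The names mask and offset are introduced here for illustration only.

```r
V2 <- c(0, 0, 6, 6, 0, 4)

# TRUE wherever V2 is non-zero
mask <- as.logical(V2)

# Multiplying by the mask zeroes the offset for rows where V2 == 0
offset <- (10 - V2) * mask

print(mask)
print(offset)
```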

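A third option builds the offset with ifelse():

df$V1 <- with(df, V1 + ifelse(V2 != 0, 10 - V2, 0))

Or with data.table, which updates the matching rows by reference:

library(data.table)
dt <- as.data.table(df)

dt[V2 != 0, V1 := V1 + 10 - V2]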
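
To check that the base R approaches above agree with the original loop, and to get a rough feel for the speed difference, something like the following sketch could be used. The data are simulated, and the row count, seed, and function names are hypothetical choices made here, not taken from the question. Timings will vary by machine, so none are quoted.

```r
# Simulated data (hypothetical size and values, not the asker's data set)
set.seed(42)
n <- 20000
big <- data.frame(
  V1 = 970068340 + cumsum(sample(1:20, n, replace = TRUE)),
  V2 = sample(c(0, 4, 6), n, replace = TRUE)
)

# Element-by-element loop, as in the question
loop_fix <- function(d) {
  for (i in seq_along(d$V2)) {
    if (d$V2[i] != 0) d$V1[i] <- d$V1[i] + (10 - d$V2[i])
  }
  d
}

# Logical-index replacement, as in the Solution
index_fix <- function(d) {
  rows <- d$V2 != 0
  d$V1[rows] <- d$V1[rows] + 10 - d$V2[rows]
  d
}

# ifelse() offset
ifelse_fix <- function(d) {
  d$V1 <- d$V1 + ifelse(d$V2 != 0, 10 - d$V2, 0)
  d
}

t_loop  <- system.time(res_loop  <- loop_fix(big))
t_index <- system.time(res_index <- index_fix(big))
res_ifelse <- ifelse_fix(big)

# All three approaches should produce the same V1
stopifnot(isTRUE(all.equal(res_loop$V1, res_index$V1)),
          isTRUE(all.equal(res_loop$V1, res_ifelse$V1)))

# Elapsed seconds for the loop and the vectorised version
print(c(loop = t_loop[["elapsed"]], index = t_index[["elapsed"]]))
```

The data.table version is left out so that the sketch runs with base R alone.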
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow