How to efficiently set data frame column equal to sum of next 48 rows of another column in R

StackOverflow https://stackoverflow.com/questions/23123124

  •  05-07-2023

Question

I have a data frame dat holding hourly traffic counts in dat$x1, and a column dat$x48 that should hold the sum of dat$x1 over the current row and the 47 rows that follow it. This is the 48-hour volume for the period starting in the hour represented by the row.

I tried a for loop, which was very slow. My research suggests the for loop was a bad idea and that I should be looking at one of the apply functions instead. However, after reading introductions to the apply functions, I could not figure out how to use them for this purpose.

Here is the for loop I tried, which was too slow:

for (i in 1:nrow(dat)) {
  # x1 is in column 8 and x48 is in column 15.
  # Parentheses are needed: i:i+47 parses as (i:i)+47, a single row.
  dat[i, 15] <- sum(dat[i:(i + 47), 8])
}

For a simplified example, where I wanted only 4-hour sums, it would start like this:

x1  x4
1   NA
5   NA
3   NA
8   NA
6   NA
2   NA
1   NA
1   NA
...

I want the data frame to end up like this, where x4 is the sum of the corresponding x1 value and the next 3 x1 values:

x1  x4
1   17 
5   22
3   19
8   17
6   10
...

The data frame has 2 million rows.
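
For reference, the target column in the 4-hour example can be written directly with an apply-family call. This is only a sketch of the definition: the variable name width is an assumption, and only the 8 rows shown above are used, so the last three rows have no full window and come out as NA. It still visits every window, so it is unlikely to be fast on 2 million rows.

```r
# The 8 rows shown in the question (the real data continues past these)
dat <- data.frame(x1 = c(1, 5, 3, 8, 6, 2, 1, 1), x4 = NA_real_)
width <- 4          # hypothetical name for the window length
n <- nrow(dat)

# For each row i, sum x1 over rows i..(i + width - 1);
# rows without a full window ahead get NA.
dat$x4 <- vapply(seq_len(n), function(i) {
  if (i + width - 1 > n) NA_real_ else sum(dat$x1[i:(i + width - 1)])
}, numeric(1))

print(dat)
```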


Solution

Try the rollapply function from the zoo package:

library(zoo)

lookahead <- 48

dat <- data.frame(x1 = 1:100, x48 = NA)

## rollapply returns one value per complete window,
## i.e. (nrow(dat) - lookahead + 1) values,
## so assign the result to only that many rows.

dat[1:(nrow(dat) - lookahead + 1), "x48"] <- rollapply(dat$x1, width = lookahead, FUN = sum)
dat
##      x1  x48
## 1     1 1176
## 2     2 1224
## 3     3 1272
## 4     4 1320
## 5     5 1368
## 6     6 1416
## 7     7 1464
## 8     8 1512
## 9     9 1560
## 10   10 1608
## 11   11 1656
## 12   12 1704
## 13   13 1752
## 14   14 1800
## 15   15 1848
## 16   16 1896
## 17   17 1944
## 18   18 1992
## 19   19 2040
## 20   20 2088
## 21   21 2136
## 22   22 2184
## 23   23 2232
## 24   24 2280
## 25   25 2328
## 26   26 2376
## 27   27 2424
## 28   28 2472
## 29   29 2520
## 30   30 2568
## 31   31 2616
## 32   32 2664
## 33   33 2712
## 34   34 2760
## 35   35 2808
## 36   36 2856
## 37   37 2904
## 38   38 2952
## 39   39 3000
## 40   40 3048
## 41   41 3096
## 42   42 3144
## 43   43 3192
## 44   44 3240
## 45   45 3288
## 46   46 3336
## 47   47 3384
## 48   48 3432
## 49   49 3480
## 50   50 3528
## 51   51 3576
## 52   52 3624
## 53   53 3672
## 54   54   NA
## 55   55   NA
## 56   56   NA
## 57   57   NA
## 58   58   NA
## 59   59   NA
## 60   60   NA
## 61   61   NA
## 62   62   NA
## 63   63   NA
## 64   64   NA
## 65   65   NA
## 66   66   NA
## 67   67   NA
## 68   68   NA
## 69   69   NA
## 70   70   NA
## 71   71   NA
## 72   72   NA
## 73   73   NA
## 74   74   NA
## 75   75   NA
## 76   76   NA
## 77   77   NA
## 78   78   NA
## 79   79   NA
## 80   80   NA
## 81   81   NA
## 82   82   NA
## 83   83   NA
## 84   84   NA
## 85   85   NA
## 86   86   NA
## 87   87   NA
## 88   88   NA
## 89   89   NA
## 90   90   NA
## 91   91   NA
## 92   92   NA
## 93   93   NA
## 94   94   NA
## 95   95   NA
## 96   96   NA
## 97   97   NA
## 98   98   NA
## 99   99   NA
## 100 100   NA
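
If rollapply is still too slow at 2 million rows, one possible alternative is to take differences of a cumulative sum, which needs only a single pass over the data. The sketch below uses base R only; forward_sum is a hypothetical helper name, not part of zoo or of the answer above. It assumes x1 contains no NA values, since an NA would propagate through cumsum to every later window.

```r
# Forward-looking rolling sum via cumulative sums (base R only).
# forward_sum is a hypothetical helper, not from the original answer.
forward_sum <- function(x, width) {
  n <- length(x)
  out <- rep(NA_real_, n)
  if (n < width) return(out)
  cs <- c(0, cumsum(as.numeric(x)))   # cs[i + 1] holds sum(x[1:i])
  starts <- seq_len(n - width + 1)    # rows with a full window ahead
  out[starts] <- cs[starts + width] - cs[starts]
  out
}

# Same toy data as in the answer above
dat <- data.frame(x1 = 1:100, x48 = NA)
dat$x48 <- forward_sum(dat$x1, 48)

# The 4-hour example from the question
x4 <- forward_sum(c(1, 5, 3, 8, 6, 2, 1, 1), 4)
```

Rows without 48 further observations are left as NA, matching the rollapply output. With non-integer counts, differencing a long cumulative sum can accumulate small floating-point error, so it is worth comparing against rollapply on a sample first.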
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow