Sequential subtraction in R

https://stackoverflow.com/questions/23657372

22-07-2023

Question

I would highly appreciate it if somebody could help me out with this. It looks simple, but I have no clue how to go about it.

I am trying to work out the percentage change in one row with respect to the previous one. For example, my data frame looks like this:

day          value  

 1           21
 2           23.4
 3           10.7
 4           5.6 
 5           3.2
 6           35.2  
 7           12.9
 8           67.8
 .            .
 .            .  
 .            .
 365         27.2

What I am trying to do is calculate the percentage change in each row with respect to the previous row. For example:

 day              value   

 1                  21
 2          ((day2-day1)/day1)*100
 3          ((day3-day2)/day2)*100
 4          ((day4-day3)/day3)*100
 5          ((day5-day4)/day4)*100
 6          ((day6-day5)/day5)*100
 7          ((day7-day6)/day6)*100
 8          ((day8-day7)/day7)*100
 .                  .
 .                  .
 .                  .
 365        ((day365-day364)/day364)*100

and then print out only those days where there was a percentage increase of >50% from the previous row.

Many thanks

Solution

You are looking for diff(). See its help page by typing ?diff. Here are the indices of days that fulfill your criterion:

> value <- c(21,23.4,10.7,5.6,3.2,35.2,12.9,67.8)
> which(diff(value)/head(value,-1)>0.5)+1
[1] 6 8
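
To get back the matching rows rather than just the indices, the same expression can be used to subset a data frame. This is a minimal sketch, assuming the data sit in a data frame called df (a name chosen here for illustration) with the columns day and value from the question:

```r
# Assumed example data frame, built from the first eight values in the question
df <- data.frame(day = 1:8,
                 value = c(21, 23.4, 10.7, 5.6, 3.2, 35.2, 12.9, 67.8))

# Relative change of each row with respect to the previous one.
# diff() returns n - 1 differences, and head(x, -1) drops the last element,
# so each difference is divided by the value that precedes it.
rel_change <- diff(df$value) / head(df$value, -1)

# rel_change[i] describes row i + 1, hence the + 1 when converting to row indices
idx <- which(rel_change > 0.5) + 1

# Rows whose value rose by more than 50% over the previous day (days 6 and 8)
result <- df[idx, ]
print(result)
```

The + 1 offset is the easy part to forget: without it the result points at the day before each jump.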

OTHER TIPS

Use diff, dividing each difference by the previous value (not the current one):

value <- 100*diff(value)/value[-length(value)]

Here's one way:

dat <- data.frame(day = 1:10, value = 1:10)

dat2 <- transform(dat, value2 = c(value[1], diff(value) / head(value, -1) * 100))
   day value    value2
1    1     1   1.00000
2    2     2 100.00000
3    3     3  50.00000
4    4     4  33.33333
5    5     5  25.00000
6    6     6  20.00000
7    7     7  16.66667
8    8     8  14.28571
9    9     9  12.50000
10  10    10  11.11111

dat2[dat2$value2 > 50, ]
  day value value2
2   2     2    100

You're looking for the diff function:

x<-c(3,1,4,1,5)
diff(x)
[1] -2  3 -3  4
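
To connect this to the percentage change asked about, it may help to see what diff computes by hand. The following is a small sketch on the same vector; manual and pct are names introduced only for this illustration:

```r
x <- c(3, 1, 4, 1, 5)

# diff(x) is each element minus its predecessor, so it has length(x) - 1 entries
manual <- x[-1] - x[-length(x)]
print(all(diff(x) == manual))

# Dividing by the predecessors (every element except the last)
# turns the differences into percentage changes
pct <- 100 * diff(x) / head(x, -1)
print(pct)
```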

Here is another way:

#dummy data
df <- read.table(text="day          value  
1           21
 2           23.4
 3           10.7
 4           5.6 
 5           3.2
 6           35.2  
 7           12.9
 8           67.8", header=TRUE)

#get index for 50% change
x <- sapply(2:nrow(df),function(i)((df$value[i]-df$value[i-1])/df$value[i-1])>0.5)

#output
df[c(FALSE,x),]
#   day value
#6   6  35.2
#8   8  67.8

Licensed under: CC-BY-SA with attribution