Question

I would highly appreciate it if somebody could help me out with this. It looks simple, but I have no clue how to go about it.

I am trying to work out the percentage change in each row with respect to the previous one. For example, my data frame looks like this:

day          value  

 1           21
 2           23.4
 3           10.7
 4           5.6 
 5           3.2
 6           35.2  
 7           12.9
 8           67.8
 .            .
 .            .  
 .            .
 365         27.2

What I am trying to do is calculate the percentage change in each row with respect to the previous row. For example:

 day              value   

 1                  21
 2          ((day2-day1)/day1)*100
 3          ((day3-day2)/day2)*100
 4          ((day4-day3)/day3)*100
 5          ((day5-day4)/day4)*100
 6          ((day6-day5)/day5)*100
 7          ((day7-day6)/day6)*100
 8          ((day8-day7)/day7)*100
 .                  .
 .                  .
 .                  .
 365        ((day365-day364)/day364)*100

and then print out only those days where there was a percentage increase of more than 50% from the previous row.

Many thanks

Solution

You are looking for diff(). See its help page by typing ?diff. Here are the indices of days that fulfill your criterion:

> value <- c(21,23.4,10.7,5.6,3.2,35.2,12.9,67.8)
> which(diff(value)/head(value,-1)>0.5)+1
[1] 6 8
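
The indices can be used directly to pull the matching rows out of a data frame. A minimal sketch, assuming the data live in a data frame with columns day and value as in the question (the names df and idx are illustrative, not from the answer):

```r
# First eight rows from the question; `df` and `idx` are illustrative names
df <- data.frame(day = 1:8,
                 value = c(21, 23.4, 10.7, 5.6, 3.2, 35.2, 12.9, 67.8))

# diff() has one element fewer than value, so +1 shifts each hit to the later day
idx <- which(diff(df$value) / head(df$value, -1) > 0.5) + 1

# Rows where the value rose by more than 50% over the previous day
print(df[idx, ])
```

With the sample data this selects the rows for days 6 and 8.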

OTHER TIPS

Use diff:

# Divide by the previous values (all but the last), not the current ones
value <- 100*diff(value)/value[-length(value)]

Here's one way:

dat <- data.frame(day = 1:10, value = 1:10)

dat2 <- transform(dat, value2 = c(value[1], diff(value) / head(value, -1) * 100))
   day value    value2
1    1     1   1.00000
2    2     2 100.00000
3    3     3  50.00000
4    4     4  33.33333
5    5     5  25.00000
6    6     6  20.00000
7    7     7  16.66667
8    8     8  14.28571
9    9     9  12.50000
10  10    10  11.11111

dat2[dat2$value2 > 50, ]
  day value value2
2   2     2    100

You're looking for the diff function:

x<-c(3,1,4,1,5)
diff(x)
[1] -2  3 -3  4
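
Because diff() returns one element fewer than its input, the differences have to be divided by all but the last element to get the change relative to the previous value. A small sketch reusing the x above (d and pct are names introduced here for illustration):

```r
x <- c(3, 1, 4, 1, 5)

# Successive differences x[i+1] - x[i]: one element shorter than x
d <- diff(x)

# head(x, -1) drops the last element, leaving the "previous" value for each difference
pct <- d / head(x, -1) * 100
print(round(pct, 1))
# -66.7 300.0 -75.0 400.0
```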

Here is another way:

#dummy data
df <- read.table(text="day          value  
1           21
 2           23.4
 3           10.7
 4           5.6 
 5           3.2
 6           35.2  
 7           12.9
 8           67.8", header=TRUE)

#get index for 50% change
x <- sapply(2:nrow(df),function(i)((df$value[i]-df$value[i-1])/df$value[i-1])>0.5)

#output
df[c(FALSE,x),]
#   day value
#6   6  35.2
#8   8  67.8
Licensed under: CC-BY-SA with attribution