Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

R how to calculate 'Percentage of Change' between two rows?

Tags:

r

  • I like to compute 'Percentage of change between two rows' while Item changes in rows.
  • For first row of items, I am ok to place some text or blank (example ‘skip’ or blank is ok) because of no comparison of previous row for same item

    df <- data.frame(period=rep(1:4,2), 
                 farm=c(rep('A',4),rep('B',4)), 
                 cumVol=c(1,5,15,31,10,12,16,24), item=c(rep('Z',3),rep('Y',3),rep('X',2)),
                 other = 1:8);
    

My desired out put of 'Percentage_Change' column (additional column to dataset) would be like below. (Formula: 100/Previous Item row value * current row Item value...Example for 2nd row..100/1 * 5 = 500)

Row Percentage_Change
1   Skip
2   500
3   300
4   Skip
5   32.25806452
6   120
7   Skip
8   150

Thanks for all your help.

like image 430
Murali Avatar asked Mar 13 '23 20:03

Murali


2 Answers

The same can be done with dplyr. This is completely analogous to akrun's answer, but uses other syntax.

library(dplyr)
mutate(df, Row = 1:n()) %>%
  group_by(item) %>%
  mutate(Percentage_Change = cumVol/lag(cumVol) * 100) %>%
  ungroup %>%
  select(Row, Percentage_Change)
## Source: local data frame [8 x 2]
## 
##     Row Percentage_Change
##   (int)             (dbl)
## 1     1                NA
## 2     2         500.00000
## 3     3         300.00000
## 4     4                NA
## 5     5          32.25806
## 6     6         120.00000
## 7     7                NA
## 8     8         150.00000
like image 106
Stibu Avatar answered Mar 15 '23 09:03

Stibu


We can use shift from data.table. Convert the 'data.frame' to 'data.table' (setDt(df)), grouped by 'item', we get the "Row" from .I and create the "Percentage_Change" by dividing the "cumVol" by the lag of "cumVol" (got by shift) and multiplying by 100. If needed the grouping column can be removed by assigning (:=) it to NULL.

library(data.table)
setDT(df)[, list(Row = .I, Percentage_Change=round(cumVol*
   (100/shift(cumVol)),2)), .(item)][, item := list(NULL)][]
#   Row Percentage_Change
#1:   1               NA
#2:   2           500.00
#3:   3           300.00
#4:   4               NA
#5:   5            32.26
#6:   6           120.00
#7:   7               NA
#8:   8           150.00

NOTE: This gives NA for elements where there are no comparison, instead of the Skip.

like image 27
akrun Avatar answered Mar 15 '23 09:03

akrun