For first row of items, I am ok to place some text or blank (example ‘skip’ or blank is ok) because of no comparison of previous row for same item
df <- data.frame(period=rep(1:4,2),
farm=c(rep('A',4),rep('B',4)),
cumVol=c(1,5,15,31,10,12,16,24), item=c(rep('Z',3),rep('Y',3),rep('X',2)),
other = 1:8);
My desired out put of 'Percentage_Change' column (additional column to dataset) would be like below. (Formula: 100/Previous Item row value * current row Item value...Example for 2nd row..100/1 * 5 = 500)
Row Percentage_Change
1 Skip
2 500
3 300
4 Skip
5 32.25806452
6 120
7 Skip
8 150
Thanks for all your help.
The same can be done with dplyr
. This is completely analogous to akrun's answer, but uses other syntax.
library(dplyr)
mutate(df, Row = 1:n()) %>%
group_by(item) %>%
mutate(Percentage_Change = cumVol/lag(cumVol) * 100) %>%
ungroup %>%
select(Row, Percentage_Change)
## Source: local data frame [8 x 2]
##
## Row Percentage_Change
## (int) (dbl)
## 1 1 NA
## 2 2 500.00000
## 3 3 300.00000
## 4 4 NA
## 5 5 32.25806
## 6 6 120.00000
## 7 7 NA
## 8 8 150.00000
We can use shift
from data.table
. Convert the 'data.frame' to 'data.table' (setDt(df)
), grouped by 'item', we get the "Row" from .I
and create the "Percentage_Change" by dividing the "cumVol" by the lag
of "cumVol" (got by shift
) and multiplying by 100. If needed the grouping column can be removed by assigning (:=
) it to NULL.
library(data.table)
setDT(df)[, list(Row = .I, Percentage_Change=round(cumVol*
(100/shift(cumVol)),2)), .(item)][, item := list(NULL)][]
# Row Percentage_Change
#1: 1 NA
#2: 2 500.00
#3: 3 300.00
#4: 4 NA
#5: 5 32.26
#6: 6 120.00
#7: 7 NA
#8: 8 150.00
NOTE: This gives NA for elements where there are no comparison, instead of the Skip
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With