
tidyverse: row wise calculations by group





I am trying to do an inventory calculation in R which requires a row wise calculation for each Mat-Plant combination. Here's a test data set -

df <- structure(list(Mat = c("A", "A", "A", "A", "A", "A", "B", "B"
), Plant = c("P1", "P1", "P1", "P2", "P2", "P2", "P1", "P1"), 
    Day = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L), UU = c(0L, 10L, 
    0L, 0L, 0L, 120L, 10L, 0L), CumDailyFcst = c(11L, 22L, 33L, 
    0L, 5L, 10L, 20L, 50L)), .Names = c("Mat", "Plant", "Day", 
"UU", "CumDailyFcst"), class = "data.frame", row.names = c(NA, 
-8L))

  Mat Plant Day  UU CumDailyFcst
1   A    P1   1   0           11
2   A    P1   2  10           22
3   A    P1   3   0           33
4   A    P2   1   0            0
5   A    P2   2   0            5
6   A    P2   3 120           10
7   B    P1   1  10           20
8   B    P1   2   0           50

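I need a new field "EffectiveFcst" such that when Day = 1, EffectiveFcst = CumDailyFcst, and for the following days (formula reconstructed from the attempt below, since the original image is unavailable) -

EffectiveFcst[t] = CumDailyFcst[t] - CumDailyFcst[t-1] + max(EffectiveFcst[t-1] - UU[t-1], 0)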
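
To make the rule for the following days concrete, here is a hand check on Mat A / Plant P1. It is a minimal sketch that assumes the rule is the one the dplyr attempt further down encodes (the previous day's EffectiveFcst less the previous day's UU, floored at zero, plus the day's forecast increment). The variable names are only illustrative.

```r
# Values for Mat A / Plant P1 from the test data, Days 1 to 3
cum_fcst <- c(11, 22, 33)
uu       <- c(0, 10, 0)

# Day 1 starts at the cumulative forecast
day1 <- cum_fcst[1]
# Later days: forecast increment + whatever is left of yesterday's value after UU
day2 <- cum_fcst[2] - cum_fcst[1] + max(day1 - uu[1], 0)
day3 <- cum_fcst[3] - cum_fcst[2] + max(day2 - uu[2], 0)

print(c(day1, day2, day3))  # 11 22 23
```

Each day depends on the previous day's result, which is why the calculation has to run in order within each Mat-Plant group.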

Here's the desired output -

  Mat Plant Day  UU CumDailyFcst EffectiveFcst
1   A    P1   1   0           11            11
2   A    P1   2  10           22            22
3   A    P1   3   0           33            23
4   A    P2   1   0            0             0
5   A    P2   2   0            5             5
6   A    P2   3 120           10            10
7   B    P1   1  10           20            20
8   B    P1   2   0           50            40

I am currently using a for loop, but the actual table has more than 300K rows, so I am hoping to do this with the tidyverse for a faster and more elegant approach. I tried the following, but it didn't work out -

group_by(df, Mat, Plant) %>%
  mutate(EffectiveFcst = ifelse(row_number()==1, CumDailyFcst, 0)) %>%
  mutate(EffectiveFcst = ifelse(row_number() > 1, CumDailyFcst - lag(CumDailyFcst, default = 0) + max(lag(EffectiveFcst, default = 0) - lag(UU, default = 0), 0), EffectiveFcst)) %>%
  print(n = nrow(.))
Shree asked Oct 09 '18 19:10


1 Answer

We can use accumulate from purrr

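library(dplyr)
library(purrr)

df %>% 
   group_by(Mat, Plant) %>% 
   # the lambda ~ .y returns each element as-is, so the result equals
   # CumDailyFcst - lag(UU, default = 0) within each Mat-Plant group
   mutate(EffectiveFcst =  accumulate(CumDailyFcst - lag(UU, default = 0),  ~ 
          .y , .init = first(CumDailyFcst))[-1] ) 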
# A tibble: 8 x 6
# Groups:   Mat, Plant [3]
#  Mat   Plant   Day    UU CumDailyFcst EffectiveFcst
#  <chr> <chr> <int> <int>        <int>         <dbl>
#1 A     P1        1     0           11            11
#2 A     P1        2    10           22            22
#3 A     P1        3     0           33            23
#4 A     P2        1     0            0             0
#5 A     P2        2     0            5             5
#6 A     P2        3   120           10            10
#7 B     P1        1    10           20            20
#8 B     P1        2     0           50            40
akrun answered Oct 29 '22 00:10
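
A note for readers whose data goes beyond the test set: the attempt in the question seems to fail for two reasons. lag(EffectiveFcst) reads the column produced by the previous mutate() rather than a running value, and max() collapses the whole group to one number (pmax() is the element-wise version). The sketch below is not the answerer's code. It is one possible way to carry the previous EffectiveFcst forward, assuming the rule is the one written in the question's attempt and that rows are already sorted by Day within each Mat-Plant group. The helper name effective_fcst is hypothetical.

```r
library(dplyr)

# Hypothetical helper (not from the answer). For one group's vectors it applies
#   E[1] = C[1]
#   E[t] = C[t] - C[t-1] + max(E[t-1] - UU[t-1], 0)   for t > 1
# where C is CumDailyFcst. Assumes the vectors are sorted by Day.
effective_fcst <- function(cum_fcst, uu) {
  step <- function(prev_eff, t) {
    cum_fcst[t] - cum_fcst[t - 1] + max(prev_eff - uu[t - 1], 0)
  }
  # Reduce(accumulate = TRUE) keeps every intermediate value, starting at init
  as.numeric(Reduce(step, seq_along(cum_fcst)[-1],
                    init = cum_fcst[1], accumulate = TRUE))
}

# Test data from the question
df <- data.frame(
  Mat = c("A", "A", "A", "A", "A", "A", "B", "B"),
  Plant = c("P1", "P1", "P1", "P2", "P2", "P2", "P1", "P1"),
  Day = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L),
  UU = c(0L, 10L, 0L, 0L, 0L, 120L, 10L, 0L),
  CumDailyFcst = c(11L, 22L, 33L, 0L, 5L, 10L, 20L, 50L)
)

result <- df %>%
  group_by(Mat, Plant) %>%
  mutate(EffectiveFcst = effective_fcst(CumDailyFcst, UU)) %>%
  ungroup()

print(result$EffectiveFcst)  # 11 22 23  0  5 10 20 40
```

On the test data this gives the same column as CumDailyFcst - lag(UU). The two can differ once UU is non-zero on consecutive days: with made-up inputs, effective_fcst(c(10, 20, 30), c(5, 5, 0)) gives 10, 15, 20, while the lagged difference gives 10, 15, 25.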
