
Subtract specific number from current observation in dplyr pipe

I want to subtract a number of my choice from any given current observation before I apply a function to my data in a dplyr pipe.

For example, let's compute the mean a) based on the real observation and b) when subtracting .10 from the current observation. The solution should be applicable to other computations or functions.

Let's say we look at the prices of three different ices (ice_id = ice identifier) on three different days (day).

da <- data.frame(ice_id = c(1,1,1,2,2,2,3,3,3), day = c(1,2,3,1,2,3,1,2,3), price = c(1.60,1.90,1.80,2.10,2.05,2.30,0.50,0.40,0.35))

da
  ice_id day price
1      1   1  1.60
2      1   2  1.90
3      1   3  1.80
4      2   1  2.10
5      2   2  2.05
6      2   3  2.30
7      3   1  0.50
8      3   2  0.40
9      3   3  0.35

Now I want to add two columns: 1) the mean ice price on that day based on the real observations of the three ices; 2) the mean ice price on that day if only the ice in the current row were .10 lower in price (= subtract .10 from the current price observation only).

1) is clear to me, but how can I add 2)?

da = da %>%
  group_by(day) %>%
  mutate(mean_dayprice = mean(price),
         mean_dayprice_lower = ?)

For example, in the first row mean_dayprice_lower would be given by: ((1.60-.10)+2.10+.50)/3 = 1.36666

Scijens asked Mar 03 '23 09:03


2 Answers

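For your particular problem you can simply calculate mean(price) - 0.1 / n(), because lowering a single price in a group of n by 0.1 lowers the group mean by 0.1 / n. (Note that mean(price - 0.1) would lower every price in the group and therefore the mean by the full 0.1.)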

However, in general you can use the following approach. Since the required operation is not vectorized in the shift (i.e., the -0.10 applies to a different element for each row), you can use purrr::map_dbl inside mutate:

da %>%
  group_by(day) %>%
  mutate(mean_dayprice = mean(price),
         mean_dayprice_lower = purrr::map_dbl(1:n(), ~mean(price - if_else(1:n() == .x, 0.1, 0))))
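
Since the question asks for something applicable to other functions as well, the same idea can be wrapped in a small helper. This is only a sketch: shift_one_at_a_time is a hypothetical name, not a dplyr or purrr function, and it assumes the summary function returns a single number.

```r
library(dplyr)

da <- data.frame(
  ice_id = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  day    = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  price  = c(1.60, 1.90, 1.80, 2.10, 2.05, 2.30, 0.50, 0.40, 0.35)
)

# Hypothetical helper (not part of dplyr or purrr): for each position i,
# subtract `shift` from x[i] only, then apply `f` to the whole vector.
shift_one_at_a_time <- function(x, shift, f) {
  vapply(seq_along(x), function(i) {
    x[i] <- x[i] - shift  # changes a local copy; the outer x is untouched
    f(x)
  }, numeric(1))
}

res <- da %>%
  group_by(day) %>%
  mutate(
    mean_dayprice_lower   = shift_one_at_a_time(price, 0.1, mean),
    median_dayprice_lower = shift_one_at_a_time(price, 0.1, median)
  ) %>%
  ungroup()
```

Swapping mean for median (or any other function of the group's prices) then only requires changing the last argument.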

Cettt answered Mar 04 '23 23:03


Regardless of which entry is lowered, the second column will always be lower than the group mean by 0.1 / n, where n is the number of entries in the group. So you can do:

da %>%
  group_by(day) %>%
  mutate(mean_dayprice = mean(price),
         mean_dayprice_lower = mean_dayprice - 0.1 / n())

# A tibble: 9 x 5
# Groups:   day [3]
  ice_id   day price mean_dayprice mean_dayprice_lower
   <dbl> <dbl> <dbl>         <dbl>               <dbl>
1      1     1  1.6           1.4                 1.37
2      1     2  1.9           1.45                1.42
3      1     3  1.8           1.48                1.45
4      2     1  2.1           1.4                 1.37
5      2     2  2.05          1.45                1.42
6      2     3  2.3           1.48                1.45
7      3     1  0.5           1.4                 1.37
8      3     2  0.4           1.45                1.42
9      3     3  0.35          1.48                1.45
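
The shortcut relies on the mean being linear in each observation, so it should not be expected to carry over to functions such as median. As a quick sanity check, the sketch below compares it with a brute-force recomputation for every row; the column names shortcut and brute_force are made up for this illustration.

```r
library(dplyr)

da <- data.frame(
  ice_id = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  day    = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  price  = c(1.60, 1.90, 1.80, 2.10, 2.05, 2.30, 0.50, 0.40, 0.35)
)

check <- da %>%
  group_by(day) %>%
  mutate(
    # Closed form: lowering one of n prices by 0.1 lowers the mean by 0.1 / n
    shortcut = mean(price) - 0.1 / n(),
    # Brute force: lower only the i-th price, then recompute the group mean
    brute_force = vapply(seq_len(n()), function(i) {
      p <- price
      p[i] <- p[i] - 0.1
      mean(p)
    }, numeric(1))
  ) %>%
  ungroup()

# Compares the two columns up to floating-point tolerance
all.equal(check$shortcut, check$brute_force)
```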

StupidWolf answered Mar 04 '23 23:03