Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to output the sum of specific rows from one data frame to a new column in another data frame?

Tags:

r

I would ultimately like to have df2 with certain dates and the cumulative sum of values connected to those date ranges from df1.

df1 = data.frame("date"=c("10/01/2020","10/02/2020","10/03/2020","10/04/2020","10/05/2020",
                          "10/06/2020","10/07/2020","10/08/2020","10/09/2020","10/10/2020"),
                 "value"=c(1:10))
df1
> df1
   date       value
1  10/01/2020     1
2  10/02/2020     2
3  10/03/2020     3
4  10/04/2020     4
5  10/05/2020     5
6  10/06/2020     6
7  10/07/2020     7
8  10/08/2020     8
9  10/09/2020     9
10 10/10/2020    10
df2 = data.frame("date"=c("10/05/2020","10/10/2020"))
df2
> df2
  date
1 10/05/2020
2 10/10/2020

I realize this is incorrect, but I am not sure how to define df2$value as the sums of certain df1$value rows:

df2$value = filter(df1, c(sum(1:5),sum(6:10)))
df2

I would like the output to look like this:

> df2
   date       value
1  10/05/2020    15
2  10/10/2020    40
like image 825
CyanoSloughth Avatar asked Oct 16 '25 00:10

CyanoSloughth


1 Answers

Here is another approach using dplyr and lubridate:

library(lubridate)
library(dplyr)

df1 %>% 
  mutate(date = dmy(date)) %>%
  mutate(date = if_else(date == "2020-05-10" |
                      date == "2020-10-10", date, NA_Date_)) %>% 
  fill(date, .direction = "up") %>% 
  group_by(date) %>% 
  summarise(value = sum(value))
  date       value
  <date>     <int>
1 2020-05-10    15
2 2020-10-10    40
like image 129
TarJae Avatar answered Oct 18 '25 16:10

TarJae