I would like to calculate sum of sales from the beggining of the year to the newest date.
My data:
ID Date Sales
1 11-2016 100
1 12-2016 100
1 01-2017 200
1 02-2017 300
MY YTD should be 200+300
You could use dplyr to summarise by year. lubridate is also useful to group_by year:
df1<-read.table(text="ID Date Sales
1 11-2016 100
1 12-2016 100
1 01-2017 200
1 02-2017 300",header=TRUE, stringsAsFactors=FALSE)
df1$Date <- as.yearmon(df1$Date,format="%m-%Y")
library(dplyr);library(lubridate)
df1%>%
group_by(Year=year(Date))%>%
summarise(Sales=sum(Sales))
Year Sales
<dbl> <int>
1 2016 200
2 2017 500
In a business setting, year-to-date sales (YTD) are usually calculated for each month. None of the current answers does that, so here is a solution using the latest version of tidyverse and base::cumsum().
library(tidyverse)
# changed input data to help understanding the results
import_sales <- tibble::tribble(
~ID, ~Date, ~Sales,
"1", "10-2016", 120,
"2", "11-2016", 240,
"3", "12-2016", 450,
"4", "01-2017", 360,
"5", "02-2017", 230,
"6", "03-2017", 310,
)
sales_data <- import_sales %>%
dplyr::mutate(
Date = lubridate::my(Date),
Year = lubridate::year(Date),
Month = lubridate::month(Date),
) %>%
dplyr::group_by(
Year,
Month
) %>%
dplyr::summarise(
Sales_Month = base::sum(Sales, na.rm = TRUE),
.groups = "drop"
) %>%
dplyr::arrange(
Year,
Month
)
sales_data %>%
dplyr::group_by(
Year
) %>%
dplyr::mutate(
Sales_YTD = base::cumsum(dplyr::coalesce(Sales_Month, 0)),
Sales_Year = base::sum(Sales_Month, na.rm = TRUE),
) %>%
dplyr::ungroup() %>%
dplyr::select(
Year,
Month,
Sales_Month,
Sales_YTD,
Sales_Year,
)
Here is the result:
# A tibble: 6 × 5
Year Month Sales_Month Sales_YTD Sales_Year
<dbl> <dbl> <dbl> <dbl> <dbl>
1 2016 10 120 120 810
2 2016 11 240 360 810
3 2016 12 450 810 810
4 2017 1 360 360 900
5 2017 2 230 590 900
6 2017 3 310 900 900
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With