Let's say we have a data frame defined as follows:
    mydata <- data.frame(id = c('A', 'B', 'C', 'D'),
                         start_date = as.Date(c('2012-08-05',
                                                '2013-05-04',
                                                '2012-02-01',
                                                '2015-03-02')),
                         end_date = as.Date(c('2014-01-12',
                                              '2015-06-05',
                                              '2016-05-06',
                                              '2017-09-12')))
Here start_date is the day an employee joined, end_date is the day they left, and id is the unique employee ID.
For each month from 1 February 2012 (the earliest start_date) to 12 September 2017 (the latest end_date), I would like the number of employees, month by month. The final output should be in a format similar to the one below (it doesn't matter whether it is in wide or long format):
In the table above, the columns denote the months (1 to 12), the rows the years, and the cells the number of employees in that month.
Any help will be highly appreciated.
Here is a solution with mapply in base R.
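    # Function to get the date of the first day of a month (by @digEmAll)
    toFirstDayOfMonth <- function(dates) dates - as.POSIXlt(dates)$mday + 1

    # Generate one date per employee per month employed.
    # SIMPLIFY = FALSE keeps a list of Date vectors even when every employee
    # spans the same number of months (mapply would otherwise return a numeric matrix).
    dates <- Reduce(c, with(mydata, mapply(seq, toFirstDayOfMonth(start_date), end_date,
                                           by = "month", SIMPLIFY = FALSE)))

    # Count occurrences of year/month combinations
    table(format(dates, "%Y"), format(dates, "%m"))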
The result:

           01 02 03 04 05 06 07 08 09 10 11 12
      2012  0  1  1  1  1  1  1  2  2  2  2  2
      2013  2  2  2  2  3  3  3  3  3  3  3  3
      2014  3  2  2  2  2  2  2  2  2  2  2  2
      2015  2  2  3  3  3  3  2  2  2  2  2  2
      2016  2  2  2  2  2  1  1  1  1  1  1  1
      2017  1  1  1  1  1  1  1  1  1  0  0  0
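
Two points may be worth illustrating. First, why the start date is moved to the first of the month: seq with by = "month" steps from the start day, so an employee who joins late in one month and leaves early in another would lose their final month. Second, the question also allows long format, which as.data.frame can produce from the table. The sketch below is only an illustration under stated assumptions: the employee who joins on 2012-08-15 is hypothetical and not part of mydata, and the column names year, month and employees are my own choice.

```r
# Helper from the answer: move each date to the first day of its month
toFirstDayOfMonth <- function(dates) dates - as.POSIXlt(dates)$mday + 1

# Hypothetical employee (not in mydata): joins on the 15th, leaves on the 12th
start <- as.Date("2012-08-15")
end   <- as.Date("2012-11-12")

# Without the helper, the step to 2012-11-15 overshoots `end`, so November is lost
unsnapped <- format(seq(start, end, by = "month"), "%Y-%m")

# With the helper, steps land on the 1st, so November is included
snapped <- format(seq(toFirstDayOfMonth(start), end, by = "month"), "%Y-%m")

# Same data as in the question
mydata <- data.frame(id = c("A", "B", "C", "D"),
                     start_date = as.Date(c("2012-08-05", "2013-05-04",
                                            "2012-02-01", "2015-03-02")),
                     end_date = as.Date(c("2014-01-12", "2015-06-05",
                                          "2016-05-06", "2017-09-12")))

# One date per employee per month employed
dates <- Reduce(c, with(mydata, mapply(seq, toFirstDayOfMonth(start_date), end_date,
                                       by = "month", SIMPLIFY = FALSE)))

# Wide table with named dimensions, then the long-format equivalent
counts <- table(year = format(dates, "%Y"), month = format(dates, "%m"))
long <- as.data.frame(counts, responseName = "employees")
head(long)
```

Each row of long is one year/month combination with its employee count, including the zero-count months that appear in the wide table.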