I have a list of people and their working start and end times during a day. I want to plot a curve showing the total of people working at any given minute in the day. What I could do is just add 1440 additional conditional boolean variables for each minute of the day and sum them up, but that seems very inelegant. I'm wondering if there a better way to do it (integrals?).
Here's the code to generate a df with my sample data:
sample_wt <- function() {
require(lubridate)
set.seed(10)
worktime <- data.frame(
ID = c(1:100),
start = now()+abs(rnorm(100,4800,2400))
)
worktime$end <- worktime$start + abs(rnorm(100,20000,10000))
worktime$length <- difftime(worktime$end, worktime$start, units="mins")
worktime
}
To create a sample data , you can do something like:
DF <- sample_wt()
Here one option using IRanges
package from Bioconductor.
library(IRanges)
## generate sample
DF <- sample_wt()
## create the range from the sample data
rangesA <- IRanges(as.numeric(DF$start), as.numeric(DF$end))
## create one minute range
xx = seq(min(DF$start),max(DF$end),60)
rangesB <- IRanges(as.numeric(xx),as.numeric(xx+60))
## count the overlaps
ov <- countOverlaps(rangesB, rangesA, type="within")
## plot the result
plot(xx,ov,type='l')
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With