I want to calculate a count of items over time using their Start and End dates.
Some sample data
START <- as.Date(c("2014-01-01", "2014-01-02","2014-01-03","2014-01-03"))
END <- as.Date(c("2014-01-04", "2014-01-03","2014-01-03","2014-01-04"))
df <- data.frame(START,END)
df
gives
START END
1 2014-01-01 2014-01-04
2 2014-01-02 2014-01-03
3 2014-01-03 2014-01-03
4 2014-01-03 2014-01-04
A table showing a count of these items across time (based on their Start and End times) is as follows:
DATETIME COUNT
2014-01-01 1
2014-01-02 2
2014-01-03 4
2014-01-04 2
Can this be done using R, especially using dplyr? Many thanks.
To count occurrences between columns, simply use both names, and it provides the frequency between the values of each column. This process produces a dataset of all those comparisons that can be used for further processing. It expands the variety a comparison you can make.
count conditionally in RYou can use base R to create conditions and count the number of occurrences in a column. If you are an Excel user, it is similar to function COUNTIF.
The ncol() function in R programming That is, ncol() function returns the total number of columns present in the object.
This would do it. You can change the column names as necessary.
as.data.frame(table(Reduce(c, Map(seq, df$START, df$END, by = 1))))
# Var1 Freq
# 1 2014-01-01 1
# 2 2014-01-02 2
# 3 2014-01-03 4
# 4 2014-01-04 2
As noted in the comments, Var1
in the above solution is now a factor, and not a date. To keep the date class in the first column, you could do some more work to the above solution, or use plyr::count
instead of as.data.frame(table(...))
library(plyr)
count(Reduce(c, Map(seq, df$START, df$END, by = 1)))
# x freq
# 1 2014-01-01 1
# 2 2014-01-02 2
# 3 2014-01-03 4
# 4 2014-01-04 2
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With