
R: Calculate a count of items over time using start and end dates

Tags: r, dplyr, duration

I want to calculate a count of items over time using their Start and End dates.

Some sample data

START <- as.Date(c("2014-01-01", "2014-01-02","2014-01-03","2014-01-03"))
END <- as.Date(c("2014-01-04", "2014-01-03","2014-01-03","2014-01-04"))
df <- data.frame(START,END)
df

gives

       START        END
1 2014-01-01 2014-01-04
2 2014-01-02 2014-01-03
3 2014-01-03 2014-01-03
4 2014-01-03 2014-01-04

The result I am after is a table counting how many items are active on each date, with both the START and END dates included:

DATETIME    COUNT
2014-01-01   1 
2014-01-02   2 
2014-01-03   4 
2014-01-04   2 

Can this be done using R, especially using dplyr? Many thanks.

Dave M asked Oct 10 '14 00:10




1 Answer

This would do it. You can change the column names as necessary.

as.data.frame(table(Reduce(c, Map(seq, df$START, df$END, by = 1))))
#         Var1 Freq
# 1 2014-01-01    1
# 2 2014-01-02    2
# 3 2014-01-03    4
# 4 2014-01-04    2
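
To see what the one-liner is doing, it can be unpacked into named steps. This is only a sketch of the same idea; the intermediate names (per_item, all_days, counts) are made up for illustration and are not part of the original answer.

```r
# Sample data from the question
START <- as.Date(c("2014-01-01", "2014-01-02", "2014-01-03", "2014-01-03"))
END   <- as.Date(c("2014-01-04", "2014-01-03", "2014-01-03", "2014-01-04"))
df <- data.frame(START, END)

# Step 1: build one daily sequence per row, from START to END inclusive.
# Map() returns a list with one Date vector per item.
per_item <- Map(seq, df$START, df$END, by = 1)

# Step 2: concatenate the per-item sequences into a single vector.
# c() on Date objects keeps the Date class.
all_days <- Reduce(c, per_item)

# Step 3: tabulate how often each day occurs.
# The first column is named all_days here (Var1 in the one-liner)
# and is a factor, because table() converts its input to a factor.
counts <- as.data.frame(table(all_days))
counts
```

Each item contributes one entry to all_days for every day it is active, so the frequency of a date is the number of items covering it.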

As noted in the comments, Var1 in the above solution is a factor, not a date. To keep the Date class in the first column, you could do some more work on the above solution, or use plyr::count instead of as.data.frame(table(...)):

library(plyr)
count(Reduce(c, Map(seq, df$START, df$END, by = 1)))
#            x freq
# 1 2014-01-01    1
# 2 2014-01-02    2
# 3 2014-01-03    4
# 4 2014-01-04    2
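
Since the question asks about dplyr specifically, here is one possible sketch of the same expand-then-count idea written with dplyr and tidyr. It is not from the original answer, and it assumes reasonably recent versions of both packages (dplyr 0.8 or later for the name argument of count(), tidyr 1.0 or later for unnest() on a named list column). The output column names DATETIME and COUNT are taken from the table in the question.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
START <- as.Date(c("2014-01-01", "2014-01-02", "2014-01-03", "2014-01-03"))
END   <- as.Date(c("2014-01-04", "2014-01-03", "2014-01-03", "2014-01-04"))
df <- data.frame(START, END)

counts <- df %>%
  # list column holding one daily sequence per row (START to END inclusive)
  mutate(DATETIME = Map(seq, START, END, by = 1)) %>%
  # one output row per item per active day; the Date class is kept
  unnest(DATETIME) %>%
  # dplyr::count is spelled out because plyr also exports a count()
  dplyr::count(DATETIME, name = "COUNT")

counts
```

The DATETIME column stays a Date here. If you stay with the base R version instead, as.Date(as.character(Var1)) should turn the factor column back into dates.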
Rich Scriven answered Nov 04 '22 15:11
