I need help counting number of days in a given date range. Here is my data set:
dat<- data.frame(a=c(seq(as.Date("2018-01-01"), as.Date("2018-01-3"), 1),
seq(as.Date("2018-01-08"), as.Date("2018-01-10"), 1),
seq(as.Date("2018-01-23"), as.Date("2018-01-31"), 1),
seq(as.Date("2018-03-01"), as.Date("2018-03-05"), 1)),
b= c(rep("x",5), rep("y",5), rep("x",5), rep("y",5)) )
a b
1 2018-01-01 x
2 2018-01-02 x
3 2018-01-03 x
4 2018-01-08 x
5 2018-01-09 x
6 2018-01-10 y
7 2018-01-23 y
8 2018-01-24 y
9 2018-01-25 y
10 2018-01-26 y
11 2018-01-27 x
12 2018-01-28 x
13 2018-01-29 x
14 2018-01-30 x
15 2018-01-31 x
16 2018-03-01 y
17 2018-03-02 y
18 2018-03-03 y
19 2018-03-04 y
20 2018-03-05 y
This is reports received from a ship and "x" and "y" are different types of fuel. On 01,02 and 03 Jan, ship reported it was using "x" type fuel. Then ship didn't report anything on the 4th, 5th, 6th and 7th of Jan. Ship sent another report on 8th (which is a consolidation of the 4th, 5th, 6th, 7th and 8th of Jan) that it is still using fuel type "x". If ship changes its fuel type to "y", it will send out a report.
I want to count number of days when the fuel type is "x" and number of days when fuel type is "y". If there is a gap between the dates like for example
1 2018-01-01 x
2 2018-01-02 x
3 2018-01-03 x
4 2018-01-08 x
5 2018-01-09 x
then the number of days between the 1st and 5th row should be 8 days (09/Jan -01/Jan). So count of "x" is 8 days Then it should calculate the next counter in column b which is "y".
6 2018-01-10 y
7 2018-01-23 y
8 2018-01-24 y
9 2018-01-25 y
10 2018-01-26 y
Here diff in days is 16 days (26/Jan-10-Jan). So count of "y" is 16 days.
Then again we have "x":
11 2018-01-27 x
12 2018-01-28 x
13 2018-01-29 x
14 2018-01-30 x
15 2018-01-31 x
Here, count of "x" is 4 days (31/Jan-27/Jan). So total count of "x" is (8+4)= 12 days. And we count similarly of "y".
16 2018-03-01 y
17 2018-03-02 y
18 2018-03-03 y
19 2018-03-04 y
20 2018-03-05 y
Here is the catch. Ship didn't report anything in February. Since last report fuel type use was "x", which was reported on 31/Jan, it means that entire February, ship was using fuel type "x" and therefore we need to add 28 days of february to "x" which makes it (8+4+28)= 40 days
And "y" count is (16+4)=21 days
I can't seem to understand how to code the logic. Any help would be appreciated.
data.table
approach
library(data.table)
#create sample data
dt1 <- setDT(dat)
#create a data.table with one row for each day within the range of dt1
dt2 <- data.table( a = seq( min( dt1$a ), max( dt1$a), by = "days") )
#perform rolling join to get the last 'b' from dt1 on all dates in dt2
dt2[, b := dt1[dt2, b, on = "a", roll = TRUE]][]
#summarise by b (number of rows = number of days, so we can use .N)
dt2[, (days = .N), by = "b"]
# b N
# 1: x 42
# 2: y 22
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With