I am working on a data frame that contains 2 columns as follows:
time frequency
2014-01-06 13
2014-01-07 30
2014-01-09 56
My issue is that I am interested in counting the days on which frequency is 0. The data is pulled using RPostgreSQL/RSQLite, so a date only appears when there is a value (i.e. when frequency is at least 1). Is there an easy way to count the dates that don't actually exist in the data frame? For example, over the date range 2014-01-01 to 2014-01-10, I would want it to count 7.
My only thought was to brute-force it: create a separate data frame with every date (note that this is 4+ years of dates, which would be an immense undertaking), merge the two data frames, and count the number of NA values. I'm sure there is a more elegant solution than what I've thought of.
Thanks!
Sort by date and then look for gaps.
start <- as.Date("2014-01-01")
time <- as.Date(c("2014-01-06", "2014-01-07","2014-01-09"))
end <- as.Date("2014-01-10")
time <- sort(unique(time))
# Include start and end dates, so the missing dates are 1/1-1/5, 1/8, 1/10
d <- c(time[1] - start,
       diff(time) - 1,
       end - time[length(time)])
d # [1] 5 0 1 1
sum(d) # 7 missing days
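As a cross-check, the full-calendar approach from the question is also cheap in R: a daily sequence covering four years is only about 1,500 dates. A minimal sketch, assuming the same start, end, and time values as above; the names all_days and missing_days are introduced here purely for illustration:

```r
start <- as.Date("2014-01-01")
end   <- as.Date("2014-01-10")
time  <- as.Date(c("2014-01-06", "2014-01-07", "2014-01-09"))

# Every calendar day in the range of interest
all_days <- seq(start, end, by = "day")

# Keep only the days that never appear in the data
missing_days <- all_days[!all_days %in% time]

length(missing_days)  # 7
```

This also gives the missing dates themselves, at the cost of materializing the whole range.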
And now for which days are missing...
(gaps <- data.frame(gap_starts = c(start, time + 1)[d > 0],
                    gap_length = d[d > 0]))
#   gap_starts gap_length
# 1 2014-01-01          5
# 2 2014-01-08          1
# 3 2014-01-10          1
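for (g in seq_len(nrow(gaps))) {
  # Use new names so the original `start` and base `length()` are not overwritten
  gap_start <- gaps$gap_starts[g]
  gap_days <- as.integer(gaps$gap_length[g])
  # Loop over day offsets; adding a number to a Date keeps the Date class
  for (i in seq_len(gap_days)) {
    print(gap_start + i - 1)
  }
}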
# [1] "2014-01-01"
# [1] "2014-01-02"
# [1] "2014-01-03"
# [1] "2014-01-04"
# [1] "2014-01-05"
# [1] "2014-01-08"
# [1] "2014-01-10"
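The same dates can also be produced without a loop. A sketch, assuming a gaps data frame shaped like the one above (rebuilt here with plain integer lengths so the block runs on its own); missing_days is an illustrative name:

```r
gaps <- data.frame(
  gap_starts = as.Date(c("2014-01-01", "2014-01-08", "2014-01-10")),
  gap_length = c(5L, 1L, 1L)
)

# Repeat each gap start once per missing day, then add offsets 0, 1, 2, ...
# sequence(c(5, 1, 1)) is 1 2 3 4 5 1 1
missing_days <- rep(gaps$gap_starts, times = gaps$gap_length) +
  (sequence(gaps$gap_length) - 1)

print(missing_days)
```

The result is a single Date vector, which is easier to reuse (for example in a merge or a plot) than printed output.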