I would like to pass in a vector of dates, and have returned the closest date from a second vector of (partially matching) dates.
The following function does what I require for a single date, however i cannot figure out how to generalise this to the case where searchDate
is a vector of dates.
closestDate <- function(searchDate, dateList, roundDown=FALSE){
if (roundDown) {
dist2date <- as.Date(dateList) - as.Date(searchDate)
closest <- which(max(dist2date[dist2date<=0]) == dist2date)
} else {
dist2date <- as.Date(dateList) - as.Date(searchDate)
closest <- which(min(dist2date[dist2date>=0]) == dist2date)
}
return(dateList[closest])
}
dateSeq <- seq(as.Date("2011-01-01"), as.Date("2012-12-19"), by='day')
oddDates <- dateSeq[as.logical(1:length(dateSeq) %%2)]
closestDate('2012-12-14', oddDates)
[1] "2012-12-15"
miscDatesLong <- rep(c('2012-12-14', '2012-12-16', '2012-12-18'), 100 )
closestDate(miscDatesLong, oddDates)
closestDate(miscDatesLong, oddDates)
[1] "2012-12-15" "2012-12-17" "2012-12-19"
Warning message:
In unclass(time1) - unclass(time2) :
longer object length is not a multiple of shorter object length
Could someone please help?
The findInterval
function can do this quickly:
dateSeq <- seq(as.Date("2011-01-01"), as.Date("2012-12-19"), by='day')
oddDates <- dateSeq[as.logical(1:length(dateSeq) %%2)]
oddDates[ findInterval(as.Date('2012-12-14'), oddDates)+1 ]
miscDatesLong <- rep(c('2012-12-14', '2012-12-16', '2012-12-18'), 100)
oddDates[ findInterval(as.Date(miscDatesLong), oddDates) + 1 ]
To round down instead of up remove the +1
. If you really want to find the closest date, not the one just before or after you can create a new list of dates that are the midpoints of the intervals (as.Date(rowMeans(embed(as.numeric(oddDates),2)), '1960-01-01')
) and use findInterval
on those. See the arguments to findInterval
for other options.
?Vectorize
> closestDateV = Vectorize(closestDate,"searchDate")
> closestDateV(c('2012-12-15','2012-12-14'), oddDates)
2012-12-15 2012-12-14
15689 15689
The returned values have had their date-ness removed. So add it back:
> as.Date(closestDateV(c('2012-12-15','2012-12-14'), oddDates),origin="1970-01-01")
2012-12-15 2012-12-14
"2012-12-15" "2012-12-15"
You might want to wrap that all up in a new function.
Functional programming is fun!
Now, with the example, just work on the subset of dates that are less than in one case or greater than in the other case, the particular target being examined at the time.
closestDt <- function(searchDate, dateList, roundDown=FALSE)
as.Date( sapply( searchDate , function (x) if( roundDown ){
max( dateList[ dateList <= x ] ) } else {
min( dateList[ dateList >= x]) }
), "1970-01-01")
# initiate a tie-breaking function
tie.breaker <-
function( x , y , la = look.after ){
# if look.after is TRUE, eliminate all values below x
# otherwise, eliminate all values above x
if ( la ) y[ y < x ] <- NA else y[ y > x ] <- NA
# then among the remaining values, figure out the date the shortest distance away
z <- which.min( abs( x - y ) )[1]
# use [1] to just take the first result, in case y contains duplicate dates
# return z
return( z )
}
# initiate your main function
closestDate <-
function( searchDate , dateList , look.after = FALSE ){
# apply a which.min( abs( ) ) command to each of the dates given,
# across every date in the larger list
dist2date <-
sapply(
# on every element of searchDate..
as.Date( searchDate ) ,
# ..run the tie.breaker() function
tie.breaker ,
# and each time, pass in the dateList
as.Date( dateList ) ,
# and also the look.after TRUE/FALSE flag
look.after
)
# return the matching dates in the same order as passed in
dateList[ dist2date ]
}
# try with two input dates
searchDate <- c( '2012-12-14' , '2012-11-18' )
# create a few dates to test against..
someDates <- c( '2012-11-12' , '2012-11-17' , '2012-12-15' , '2012-12-13' , '2012-12-15' , '2012-11-17' , '2012-11-20' )
# return the two dates closests to the inputted dates
# the first result gives 12/13, because look.after = FALSE
closestDate( searchDate , someDates )
# the first result gives 12/15, because look.after = TRUE
closestDate( searchDate , someDates , look.after = TRUE )
# reverse the order to prove it still works
someDates <- c( '2012-11-12' , '2012-11-17' , '2012-12-13' , '2012-12-15' , '2012-12-13' , '2012-12-15' , '2012-11-17' )
# the first result gives 12/13, because look.after = FALSE
closestDate( searchDate , someDates )
# the first result gives 12/15, because look.after = TRUE
closestDate( searchDate , someDates , look.after = TRUE )
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With