
Return closest date to a given date in R

Tags: date, r, closest

My data frame consists of individual observations of individual animals. Each animal has a birthdate, which I would like to match to the closest field season end date from a vector of dates.

Here is a very basic reproducible example:

ID <- c("a", "b", "c", "d", "a") # individual "a" is measured twice here
birthdate <- as.Date(c("2012-06-12", "2014-06-14", "2015-11-11", "2016-09-30", "2012-06-12"))    
df <- data.frame(ID, birthdate)

# This is the date vector
season_enddates <- as.Date(c("2011-11-10", "2012-11-28", "2013-11-29", "2014-11-26", "2015-11-16", "2016-11-22", "2012-06-21", "2013-06-23", "2014-06-25", "2015-06-08", "2016-06-14"))

With the following code, I only get the position (index) of the closest season end date within season_enddates, not the date itself.

for(i in 1:length(df$birthdate)){
  df$birthseason[i] <- which(abs(season_enddates-df$birthdate[i]) == min(abs(season_enddates-df$birthdate[i])))
}

However, what I want is the actual date, not its index. For example, the first value of birthseason should be 2012-06-21.

Asked by Mehdi.K on Jul 13 '17 at 13:07



2 Answers

It's a bit confusing since you use variables which you didn't include in your examples.

But I think this is what you want:

for (ii in seq_len(nrow(df))) {
  df$birthseason[ii] <- as.character(season_enddates[which.min(abs(df$birthdate[ii] - season_enddates))])
}

Alternatively using lapply:

df$birthseason <- unlist(lapply(df$birthdate,function(x) as.character(season_enddates[which.min(abs(x - season_enddates))])))

Result:

> df
  ID  birthdate birthseason
1  a 2012-06-12  2012-06-21
2  b 2014-06-14  2014-06-25
3  c 2015-11-11  2015-11-16
4  d 2016-09-30  2016-11-22
5  a 2012-06-12  2012-06-21
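The result above stores birthseason as character strings because of as.character(). If a real Date column is preferred, one possible variation is to collect the winning positions first and then index season_enddates once, which keeps the Date class. This is only a sketch built on the question's example data; the name nearest_idx is introduced here for illustration and is not part of the original answer.

```r
# Example data copied from the question
ID <- c("a", "b", "c", "d", "a")
birthdate <- as.Date(c("2012-06-12", "2014-06-14", "2015-11-11",
                       "2016-09-30", "2012-06-12"))
df <- data.frame(ID, birthdate)

season_enddates <- as.Date(c("2011-11-10", "2012-11-28", "2013-11-29",
                             "2014-11-26", "2015-11-16", "2016-11-22",
                             "2012-06-21", "2013-06-23", "2014-06-25",
                             "2015-06-08", "2016-06-14"))

# nearest_idx (hypothetical name): for each row, the position in
# season_enddates with the smallest absolute gap in days
nearest_idx <- vapply(
  seq_along(df$birthdate),
  function(i) which.min(abs(as.numeric(season_enddates - df$birthdate[i]))),
  integer(1)
)

# Indexing a Date vector returns a Date vector, so no as.character() is needed
df$birthseason <- season_enddates[nearest_idx]
```

Note that which.min() returns only the first minimum, so if a birthdate were exactly halfway between two season end dates, the one that appears earlier in season_enddates would be chosen.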
Answered by Val on Nov 09 '22 at 01:11


You are looking for which season_enddate is the closest to birthdate[1], and birthdate[2], etc.

To get the data straight, I will create an actual reproducible example:

birthdate <- as.Date(c("2012-06-12", "2014-06-14", 
                       "2015-11-11", "2016-09-30", 
                       "2012-06-12"))

season_enddates <- as.Date(c("2011-11-10", "2012-11-28", 
                             "2013-11-29", "2014-11-26",
                             "2015-11-16", "2016-11-22", 
                             "2012-06-21", "2013-06-23", 
                             "2014-06-25", "2015-06-08", 
                             "2016-06-14"))

I use essentially the same approach as you, but broken down into separate steps so that it is easier to follow:

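new.vector <- rep(0, length(birthdate))           # numeric placeholder
for(i in seq_along(birthdate)){
    diffs <- abs(birthdate[i] - season_enddates)  # gap to every season end date
    inds  <- which.min(diffs)                     # position of the smallest gap
    new.vector[i] <- season_enddates[inds]        # the Date is stored as a plain number here
}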

# new.vector now contains some dates that have been converted to numbers:
as.Date(new.vector, origin = "1970-01-01")
# [1] "2012-06-21" "2014-06-25" "2015-11-16" "2016-11-22"
# [5] "2012-06-21"
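As a loop-free alternative, the same comparison could be sketched as a matrix of gaps, with one row per birthdate and one column per season end date. This is not the method used above, just an illustration under the same example data; the names gap, nearest_idx and closest are made up for this sketch. Because the final step indexes season_enddates directly, the result keeps its Date class and no conversion with an origin is needed.

```r
birthdate <- as.Date(c("2012-06-12", "2014-06-14",
                       "2015-11-11", "2016-09-30",
                       "2012-06-12"))

season_enddates <- as.Date(c("2011-11-10", "2012-11-28",
                             "2013-11-29", "2014-11-26",
                             "2015-11-16", "2016-11-22",
                             "2012-06-21", "2013-06-23",
                             "2014-06-25", "2015-06-08",
                             "2016-06-14"))

# gap[i, j] = absolute number of days between birthdate[i] and season_enddates[j]
gap <- abs(outer(as.numeric(birthdate), as.numeric(season_enddates), "-"))

# For each row (birthdate), the column (season end date) with the smallest gap
nearest_idx <- apply(gap, 1, which.min)

# Index the Date vector directly so the Date class is preserved
closest <- season_enddates[nearest_idx]
closest
# [1] "2012-06-21" "2014-06-25" "2015-11-16" "2016-11-22"
# [5] "2012-06-21"
```

The matrix has length(birthdate) * length(season_enddates) entries, so this form is probably best suited to vectors of moderate size.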
Answered by KenHBS on Nov 09 '22 at 00:11