My data frame consists of individual observations of individual animals. Each animal has a birthdate, which I would like to match to the closest field season date from a date vector.
Here is a very basic reproducible example:
ID <- c("a", "b", "c", "d", "a") # individual "a" is measured twice here
birthdate <- as.Date(c("2012-06-12", "2014-06-14", "2015-11-11", "2016-09-30", "2012-06-12"))
df <- data.frame(ID, birthdate)
# This is the date vector
season_enddates <- as.Date(c("2011-11-10", "2012-11-28", "2013-11-29", "2014-11-26", "2015-11-16", "2016-11-22", "2012-06-21", "2013-06-23", "2014-06-25", "2015-06-08", "2016-06-14"))
With the following code, I can get the position (index) in the date vector of the season end date closest to each birthdate.
for(i in 1:length(df$birthdate)){
  df$birthseason[i] <- which(abs(season_enddates-df$birthdate[i]) == min(abs(season_enddates-df$birthdate[i])))
}
However, what I want is the actual date, and not its index. For example, the first value of birthseason should be 2012-06-21.
To find the row of an R data frame whose value is nearest to a target, take the absolute difference between the target and the column, pass it to which.min, and use the resulting index inside single square brackets to subset the row.
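As a minimal sketch of that idea (the data frame `seasons`, its columns, and `target` are made up for illustration and do not come from the question):

```r
# Hypothetical lookup table of season end dates, for illustration only
seasons <- data.frame(
  season  = c("2012 summer", "2012 winter", "2014 summer"),
  enddate = as.Date(c("2012-06-21", "2012-11-28", "2014-06-25"))
)
target <- as.Date("2012-06-12")

# Absolute difference in days between every end date and the target
day_diffs <- abs(as.numeric(seasons$enddate - target))

# which.min() gives the position of the smallest difference;
# single square brackets with a trailing comma select that whole row
nearest_row <- seasons[which.min(day_diffs), ]
```

Note that which.min() returns only the first minimum, so if two dates are equally close the earlier row in the table is chosen.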
Finding the future closest date to today in Excel: select the blank cell B2, copy and paste the formula =MIN(IF(A2:A18>TODAY(),A2:A18)) into the Formula Bar, and then press Ctrl + Shift + Enter simultaneously. You will then get the closest future date to today in cell B2.
It's a bit confusing since you use variables which you didn't include in your examples.
But I think this is what you want:
for (ii in seq_len(nrow(df))) {
  df$birthseason[ii] <- as.character(season_enddates[which.min(abs(df$birthdate[ii] - season_enddates))])
}
Alternatively, using lapply:
df$birthseason <- unlist(lapply(df$birthdate,function(x) as.character(season_enddates[which.min(abs(x - season_enddates))])))
Result:
> df
ID birthdate birthseason
1 a 2012-06-12 2012-06-21
2 b 2014-06-14 2014-06-25
3 c 2015-11-11 2015-11-16
4 d 2016-09-30 2016-11-22
5 a 2012-06-12 2012-06-21
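Both versions above store birthseason as character. If you would rather keep it as a Date, one possible variation (a sketch, not the only way; the names day_diffs, nearest_idx and birthseason are just illustrative) is to compute all the differences at once and index season_enddates directly:

```r
birthdate <- as.Date(c("2012-06-12", "2014-06-14", "2015-11-11",
                       "2016-09-30", "2012-06-12"))
season_enddates <- as.Date(c("2011-11-10", "2012-11-28", "2013-11-29",
                             "2014-11-26", "2015-11-16", "2016-11-22",
                             "2012-06-21", "2013-06-23", "2014-06-25",
                             "2015-06-08", "2016-06-14"))

# Matrix of absolute day differences:
# one row per birthdate, one column per season end date
day_diffs <- abs(outer(as.numeric(birthdate), as.numeric(season_enddates), "-"))

# For each row, the column with the smallest difference (first one on ties)
nearest_idx <- apply(day_diffs, 1, which.min)

# Indexing a Date vector keeps the Date class
birthseason <- season_enddates[nearest_idx]
birthseason
# [1] "2012-06-21" "2014-06-25" "2015-11-16" "2016-11-22" "2012-06-21"
```

The matrix has length(birthdate) * length(season_enddates) cells, so this is convenient for small inputs like the example but may use a lot of memory for very long vectors.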
You are looking for which season_enddate is the closest to birthdate[1], and birthdate[2], etc.
To get the data straight, I will create an actual reproducible example:
birthdate <- as.Date(c("2012-06-12", "2014-06-14",
"2015-11-11", "2016-09-30",
"2012-06-12"))
season_enddates <- as.Date(c("2011-11-10", "2012-11-28",
"2013-11-29", "2014-11-26",
"2015-11-16", "2016-11-22",
"2012-06-21", "2013-06-23",
"2014-06-25", "2015-06-08",
"2016-06-14"))
Basically I use the function you also used, except that I decided to break it down a bit, so it's easier to follow what you're trying to do:
new.vector <- rep(0, length(birthdate))
for (i in seq_along(birthdate)) {
  diffs <- abs(birthdate[i] - season_enddates)
  inds <- which.min(diffs)
  new.vector[i] <- season_enddates[inds]
}
# new.vector now contains some dates that have been converted to numbers:
as.Date(new.vector, origin = "1970-01-01")
# [1] "2012-06-21" "2014-06-25" "2015-11-16" "2016-11-22"
# [5] "2012-06-21"
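The dates turn into numbers because the result vector starts out as numeric zeros. A small variation on the same loop (a sketch; the name nearest is just illustrative) preallocates a Date vector of NAs instead, so no conversion back is needed:

```r
birthdate <- as.Date(c("2012-06-12", "2014-06-14", "2015-11-11",
                       "2016-09-30", "2012-06-12"))
season_enddates <- as.Date(c("2011-11-10", "2012-11-28", "2013-11-29",
                             "2014-11-26", "2015-11-16", "2016-11-22",
                             "2012-06-21", "2013-06-23", "2014-06-25",
                             "2015-06-08", "2016-06-14"))

# Start from a Date vector of NAs rather than numeric zeros
nearest <- rep(as.Date(NA), length(birthdate))
for (i in seq_along(birthdate)) {
  # Assigning into a Date vector keeps the Date class
  nearest[i] <- season_enddates[which.min(abs(birthdate[i] - season_enddates))]
}
nearest
# [1] "2012-06-21" "2014-06-25" "2015-11-16" "2016-11-22" "2012-06-21"
```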