I have reduced my problem to the following task. Given a data frame with IDs and dates:
set.seed(123)
myids <- sample(c('a001', 'a002', 'a003'), 12, replace = TRUE)
mydates <- as.Date(sample(c("2007-06-22", "2004-02-13", "2007-05-22", "2001-10-10", "2008-05-05", "2004-02-15"), 12, replace = TRUE))
mydf <- data.frame(myids, mydates)
I need to select only the row with the most recent date, for each subject. The result should be:
a001 5/5/08
a002 5/5/08
a003 2/15/04
Anyone know how to do this?
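Here's a data.table solution. It groups by myids and, within each group, keeps the row of .SD where mydates is largest. Note that setDT() converts mydf to a data.table by reference, and which.max() returns only the first maximum, so tied dates still yield one row per ID.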
library(data.table)
setDT(mydf)[,.SD[which.max(mydates)],keyby=myids]
# myids mydates
# 1: a001 2008-05-05
# 2: a002 2008-05-05
# 3: a003 2004-02-15
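If you would rather not load a package, the same idea can be sketched in base R: sort so each ID's latest date comes first, then drop every later row for that ID. This is only a sketch under my own assumptions. The data frame `df` below is a small hand-made example (not the seeded sample from the question), and `sorted` and `latest` are names I made up for illustration.

```r
# Hypothetical example data, written out by hand so the result is easy to check
df <- data.frame(
  myids   = c("a001", "a002", "a001", "a003", "a002"),
  mydates = as.Date(c("2007-06-22", "2004-02-13", "2008-05-05",
                      "2004-02-15", "2001-10-10"))
)

# Sort by ID ascending, then by date descending
# (negating the numeric form of the Date reverses the date order)
sorted <- df[order(df$myids, -as.numeric(df$mydates)), ]

# duplicated() marks every row after the first one for each ID,
# so negating it keeps exactly the most recent row per ID
latest <- sorted[!duplicated(sorted$myids), ]
rownames(latest) <- NULL

print(latest)
```

As with which.max(), ties on the latest date are resolved by keeping a single row per ID. On large data the data.table version is likely the faster of the two, but I have not benchmarked it here.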