I have a data frame with four date columns. Within each row the dates should be in order: col1 first, col2 second, col3 third, and col4 last. I'd like to identify which rows have dates that are out of sequence.
Here is a toy data frame:
col1 <- c(as.Date("2004-1-1"), as.Date("2005-1-1"), as.Date("2006-1-1"))
col2 <- c(as.Date("2004-1-2"), as.Date("2005-1-3"), as.Date("2006-1-2"))
col3 <- c(as.Date("2004-1-5"), as.Date("2005-1-9"), as.Date("2006-1-19"))
col4 <- c(as.Date("2004-1-9"), as.Date("2005-1-15"), as.Date("2006-1-10"))
dates <- data.frame(col1, col2, col3, col4)
dates
        col1       col2       col3       col4
1 2004-01-01 2004-01-02 2004-01-05 2004-01-09
2 2005-01-01 2005-01-03 2005-01-09 2005-01-15
3 2006-01-01 2006-01-02 2006-01-19 2006-01-10
My desired output would be:
        col1       col2       col3       col4 Seq?
1 2004-01-01 2004-01-02 2004-01-05 2004-01-09    T
2 2005-01-01 2005-01-03 2005-01-09 2005-01-15    T
3 2006-01-01 2006-01-02 2006-01-19 2006-01-10    F
I can think of a couple of solutions. Naively, I'd suggest using apply with is.unsorted, which ?is.unsorted describes as:
Test if an object is not sorted (in increasing order), without the cost of sorting it.
!apply(dates, 1, is.unsorted)
#[1] TRUE TRUE FALSE
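One thing worth knowing about the apply approach: apply() first coerces the data frame to a matrix, and a data frame of Date columns becomes a character matrix. The comparison still works because ISO-formatted dates sort the same way as strings. The following is a minimal sketch, not part of the original answer, that converts the dates to day counts first so the comparison is numeric; the names num, in_order and strict are made up for illustration.

```r
# Toy data from the question, rebuilt so this block runs on its own
dates <- data.frame(
  col1 = as.Date(c("2004-01-01", "2005-01-01", "2006-01-01")),
  col2 = as.Date(c("2004-01-02", "2005-01-03", "2006-01-02")),
  col3 = as.Date(c("2004-01-05", "2005-01-09", "2006-01-19")),
  col4 = as.Date(c("2004-01-09", "2005-01-15", "2006-01-10"))
)

# apply() works on as.matrix(dates), which is a character matrix here
is_char <- is.character(as.matrix(dates))

# Hypothetical alternative: a numeric matrix of days since 1970-01-01
num <- sapply(dates, as.numeric)

# Rows whose dates are non-decreasing (ties allowed, the default)
in_order <- !apply(num, 1, is.unsorted)

# Rows whose dates are strictly increasing (ties count as out of order)
strict <- !apply(num, 1, is.unsorted, strictly = TRUE)

in_order
```

On the toy data both in_order and strict flag only row 3 as out of sequence. They would differ only if a row contained two equal dates next to each other.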
Otherwise, convert to long format and then do a grouped operation, which should be faster on larger datasets:
tmp <- cbind(row=seq_len(nrow(dates)), stack(lapply(dates, as.vector)))
!tapply(tmp$values, tmp$row, FUN=is.unsorted)
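To make the long-format idea easier to follow, here is a hedged sketch of the same steps with the result attached back to the data frame. It assumes as.numeric in place of as.vector (both should reduce the dates to plain day counts), and in_order is an illustrative name. stack() lays the values out column by column, so within each row group the values stay in col1 to col4 order, which is what is.unsorted needs.

```r
# Toy data from the question, rebuilt so this block runs on its own
dates <- data.frame(
  col1 = as.Date(c("2004-01-01", "2005-01-01", "2006-01-01")),
  col2 = as.Date(c("2004-01-02", "2005-01-03", "2006-01-02")),
  col3 = as.Date(c("2004-01-05", "2005-01-09", "2006-01-19")),
  col4 = as.Date(c("2004-01-09", "2005-01-15", "2006-01-10"))
)

# Long format: one line per (row, column) pair; the row index is recycled
# across the stacked columns
tmp <- cbind(row = seq_len(nrow(dates)), stack(lapply(dates, as.numeric)))

# One is.unsorted() call per original row
in_order <- !tapply(tmp$values, tmp$row, FUN = is.unsorted)

# tapply() returns a named array; drop the names before storing the result
dates$Seq <- as.vector(in_order)
dates
```

The tmp data frame has 12 rows here (3 rows times 4 columns), with the column name kept in tmp$ind.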
And finally, the brute-force method of comparing each column with the next via Map, which should be quicker still:
Reduce(`&`, Map(`<`, dates[-length(dates)], dates[-1]))
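Note that `<` treats two equal neighbouring dates as out of order, whereas is.unsorted with its defaults allows ties. A small sketch of both variants follows, assuming you want the result stored in a Seq column as in the question; strict and nonstrict are illustrative names.

```r
# Toy data from the question, rebuilt so this block runs on its own
dates <- data.frame(
  col1 = as.Date(c("2004-01-01", "2005-01-01", "2006-01-01")),
  col2 = as.Date(c("2004-01-02", "2005-01-03", "2006-01-02")),
  col3 = as.Date(c("2004-01-05", "2005-01-09", "2006-01-19")),
  col4 = as.Date(c("2004-01-09", "2005-01-15", "2006-01-10"))
)

n <- length(dates)  # number of date columns, taken before adding Seq

# Compare each column with its right-hand neighbour, then AND the results
strict    <- Reduce(`&`, Map(`<`,  dates[-n], dates[-1]))  # ties fail
nonstrict <- Reduce(`&`, Map(`<=`, dates[-n], dates[-1]))  # ties pass

dates$Seq <- strict

# Row numbers whose dates are out of sequence
bad_rows <- which(!dates$Seq)
bad_rows
```

Because the comparisons are vectorised over whole columns, this avoids a per-row function call, which is likely why it scales better than apply.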
A simple apply statement will do the trick:
dates$Seq <- apply(dates, 1, function(x) all(x == sort(x)))