This is weird: R's ifelse() seems to do some (unwanted) casting.
Let's say I have a vector of timestamps (possibly NA), and NA values should be treated differently from existing dates, for example, just ignored:
formatString = "%Y-%m-%d %H:%M:%OS"
timestamp = c(as.POSIXct(strptime("2000-01-01 12:00:00.000000", formatString)) + (1:3)*30, NA)
Now
timestamp
#[1] "2000-01-01 12:00:30 CET" "2000-01-01 12:01:00 CET" "2000-01-01 12:01:30 CET"
#[4] NA
is as desired, but shifting by 30 seconds results in
ifelse(is.na(timestamp), NA, timestamp+30)
#[1] 946724460 946724490 946724520 NA
Notice that timestamp+30 on its own still works as expected. But let's say I want to replace NA dates with a fixed date and shift all the others by 30 seconds:
fixedDate = as.POSIXct(strptime("2000-01-01 12:00:00.000000", formatString))
ifelse(is.na(timestamp), fixedDate, timestamp+30)
#[1] 946724460 946724490 946724520 946724400
Question: what's wrong with this solution, and why doesn't it work as expected?
Edit: the desired output is a vector of timestamps (not of numbers), shifted by 30 seconds, with the NAs replaced by whatever...
If you look at the way ifelse is written, it has a section of code that looks like this:
ans <- test
ok <- !(nas <- is.na(test))
if (any(test[ok]))
    ans[test & ok] <- rep(yes, length.out = length(ans))[test & ok]
Note that the answer starts off as a logical vector, the same as test. The elements for which test is TRUE then get assigned the corresponding values of yes.
The issue, then, is what happens when elements of a logical vector are assigned a date-time of class POSIXct: the vector is coerced to numeric and the class attribute is lost. You can see this if you do the following:
x <- c(TRUE, FALSE)
class(x)
# logical
x[1] <- Sys.time()
class(x)
# numeric
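To make the direction of the coercion concrete, here is a minimal sketch contrasting the two cases. It assumes a time zone pinned to UTC (not the CET used in the question) so that the numbers are reproducible; the names t0, x and y are only illustrative.

```r
# Assumption: UTC is used here purely to make the printed numbers reproducible.
t0 <- as.POSIXct("2000-01-01 12:00:00", tz = "UTC")

# Logical target: the POSIXct value is coerced to a bare number,
# and the class attribute of the value is not carried over.
x <- c(TRUE, FALSE)
x[1] <- t0
class(x)   # "numeric"
x[1]       # seconds since 1970-01-01 00:00:00 UTC

# POSIXct target: the class of the target vector is kept.
y <- t0 + c(0, 60)
y[2] <- t0 + 30
class(y)   # "POSIXct" "POSIXt"
```

This is why ifelse(), whose result starts life as the logical test vector, ends up returning plain numbers.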
You could get around this by writing:
timestamp <- timestamp + 30
timestamp[is.na(timestamp)] <- fixedDate
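If overwriting timestamp in place is not wanted, the same two steps can be wrapped in a small function. This is only a sketch: shift_or_fill is a hypothetical helper name, not something from base R, and the data below are rebuilt in UTC so the example is self-contained.

```r
# Hypothetical helper (not part of base R): shift every timestamp by `by`
# seconds and replace the originally missing entries with `fill`.
shift_or_fill <- function(ts, fill, by = 30) {
  out <- ts + by           # arithmetic keeps the POSIXct class; NA stays NA
  out[is.na(ts)] <- fill   # `[<-` on a POSIXct vector keeps the class
  out
}

# Assumption: UTC instead of the local time zone used in the question.
timestamp <- as.POSIXct("2000-01-01 12:00:00", tz = "UTC") + c(30, 60, 90, NA)
fixedDate <- as.POSIXct("2000-01-01 12:00:00", tz = "UTC")

result <- shift_or_fill(timestamp, fixedDate)
class(result)   # "POSIXct" "POSIXt"
```

Because the result is built from a POSIXct vector from the start, no class or time zone information has to be restored afterwards.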
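You could also do this:
fixedDate = as.POSIXct(strptime("2000-01-01 12:00:00.000000", formatString))
do.call(c, ifelse(is.na(timestamp), as.list(fixedDate), as.list(timestamp+30)))
This takes advantage of the way the replacement operator [<- handles a list on the right hand side: each list element keeps its POSIXct class. The elements are then combined with c() rather than unlist(), because unlist() would drop the class again and return plain numbers.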
You can also just re-add the class attribute like this:
x <- ifelse(is.na(timestamp), fixedDate, timestamp+30)
class(x) <- c("POSIXct", "POSIXt")
or, if you were desperate to do it in one line, like this:
`class<-`(ifelse(is.na(timestamp), fixedDate, timestamp+30), c("POSIXct", "POSIXt"))
or by copying the attributes of fixedDate:
x <- ifelse(is.na(timestamp), fixedDate, timestamp+30)
attributes(x) <- attributes(fixedDate)
This last version has the advantage of copying the tzone attribute as well.
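The attribute-copying variant can be packaged as a small wrapper. This is a sketch only: ifelse_keep_attr is a hypothetical name, and it assumes that yes carries nothing but class-like attributes (such as class and tzone), not names or dim, which would not fit a result of a different length.

```r
# Hypothetical wrapper (not part of base R): run ifelse(), then copy the
# attributes of `yes` (e.g. class and tzone) back onto the result.
# Assumption: `yes` has no length-dependent attributes such as names or dim.
ifelse_keep_attr <- function(test, yes, no) {
  out <- ifelse(test, yes, no)
  attributes(out) <- attributes(yes)
  out
}

# Assumption: UTC instead of the local time zone used in the question.
timestamp <- as.POSIXct("2000-01-01 12:00:00", tz = "UTC") + c(30, 60, 90, NA)
fixedDate <- as.POSIXct("2000-01-01 12:00:00", tz = "UTC")

res <- ifelse_keep_attr(is.na(timestamp), fixedDate, timestamp + 30)
class(res)          # "POSIXct" "POSIXt"
attr(res, "tzone")  # "UTC"
```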
As of dplyr 0.5.0, you can also use dplyr::if_else, which preserves class in the output and also enforces the same class for the true and false arguments.
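Applied to the example from the question, that might look like the sketch below. It assumes dplyr is installed (the call is guarded so the snippet still runs without it) and uses UTC so the data are reproducible.

```r
# Assumption: UTC instead of the local time zone used in the question.
timestamp <- as.POSIXct("2000-01-01 12:00:00", tz = "UTC") + c(30, 60, 90, NA)
fixedDate <- as.POSIXct("2000-01-01 12:00:00", tz = "UTC")

# Guarded so the snippet does nothing if dplyr is not available.
if (requireNamespace("dplyr", quietly = TRUE)) {
  shifted <- dplyr::if_else(is.na(timestamp), fixedDate, timestamp + 30)
  print(class(shifted))   # "POSIXct" "POSIXt"
}
```

Both branches here are POSIXct, so the stricter type check of if_else is satisfied.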
As Henrik remarked, ifelse() strips attributes, unlike a simple for-loop.
A workaround for filling NAs without grief is the simpler and clearer function zoo::na.fill.
You would then call: na.fill(timestamp, fixedDate)
See also na.locf, na.approx, and na.spline, other excellent convenience functions from zoo.