Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ifelse() stripping POSIXct attribute from vector of timestamps?

Tags:

r

posixct

This is weird: R's ifelse() seems to do some (unwanted) casting: Lets say I have a vector of timestamps (possibly NA) and NA values should be treated differently than existing dates, for example, just ignored:

formatString = "%Y-%m-%d %H:%M:%OS"
timestamp = c(as.POSIXct(strptime("2000-01-01 12:00:00.000000", formatString)) + (1:3)*30, NA)

Now

timestamp
#[1] "2000-01-01 12:00:30 CET" "2000-01-01 12:01:00 CET" "2000-01-01 12:01:30 CET"
#[6] NA    

as desired but translation by 30 seconds results in

ifelse(is.na(timestamp), NA, timestamp+30)
#[1] 946724460 946724490 946724520        NA

Notice that still, timestamp+30 works as expected but lets say I want to replace NA dates by a fixed date and translate all the others by 30 secs:

fixedDate = as.POSIXct(strptime("2000-01-01 12:00:00.000000", formatString))
ifelse(is.na(timestamp), fixedDate, timestamp+30)
#[1] 946724460 946724490 946724520 946724400

Question: whats wrong with this solution and why doesn't it work as expected?

Edit: the desired output is a vector of timestamps (not of integers) translated by 30 secs and the NA's being replaced by whatever...

like image 741
Fabian Werner Avatar asked Jun 30 '15 08:06

Fabian Werner


2 Answers

If you look at the way ifelse is written, it has a section of code that looks like this:

ans <- test
ok <- !(nas <- is.na(test))
if (any(test[ok]))
  ans[test & ok] <- rep(yes, length.out = length(ans))[test & ok]

Note that the answer starts off as a logical vector, the same as test. The elements that have test == TRUE then get assigned to the value of yes.

The issue here then is with what happens with assignment of an element or elements of a logical vector to be a date of class POSIX.ct. You can see what happens if you do this:

x <- c(TRUE, FALSE)
class(x)
# logical
x[1] <- Sys.time()
class(x)
# numeric

You could get around this by writing:

timestamp <- timestamp + 30
timestamp[is.na(timestamp)] <- fixedDate

You could also do this:

fixedDate = as.POSIXct(strptime("2000-01-01 12:00:00.000000", formatString))
unlist(ifelse(is.na(timestamp), as.list(fixedDate), as.list(timestamp+30)))

This takes advantage of the way the replacement operator [<- handles a list on the right hand side.

You can also just re-add the class attribute like this:

x <- ifelse(is.na(timestamp), fixedDate, timestamp+30)
class(x) <- c("POSIXct", "POSIXt")

or if you were desperate to do it in one line like this:

`class<-`(ifelse(is.na(timestamp), fixedDate, timestamp+30), c("POSIXct", "POSIXt"))

or by copying the attributes of fixedDate:

x <- ifelse(is.na(timestamp), fixedDate, timestamp+30)
attributes(x) <- attributes(fixedDate)

This last version has the advantage of copying the tzone attribute as well.

As of dplyr 0.5.0, you can also use dplyr::if_else which preserves class in the output and also enforces the same class for the true and false arguments.

like image 130
Nick Kennedy Avatar answered Oct 30 '22 17:10

Nick Kennedy


As Henrik remarked, ifelse() strips attributes, unlike a simple for-loop.

A workaround to filling NAs without grief is the simpler and clearer function zoo::na.fill

Then you would do: na.fill(timestamp, fixedDate)

See also na.locf, na.approx, na.spline ..., other excellent convenience functions from zoo.

like image 39
smci Avatar answered Oct 30 '22 19:10

smci