In 2013, the switch from Central European Time (CET) to Central European Summer Time (CEST) took place on Sunday 2013-03-31. Clocks were advanced by one hour from 2am to 3am, so local times from 02:00 to 02:59 do not exist on that day.
start <- strptime("2013-03-31 01:00:00", format="%F %T", tz="CET")
times <- start + (0:5) * 60*15
times
[1] "2013-03-31 01:00:00 CET" "2013-03-31 01:15:00 CET"
[3] "2013-03-31 01:30:00 CET" "2013-03-31 01:45:00 CET"
[5] "2013-03-31 03:00:00 CEST" "2013-03-31 03:15:00 CEST"
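To make the gap explicit, here is a minimal base R sketch (the variable names `step_secs`, `utc_labels` and `local_labels` are illustrative, not from the original post). It assumes the system's time zone database knows "CET". The underlying instants are evenly spaced; only the local wall-clock label jumps.

```r
# Rebuild the vector from the question using base R only.
start <- as.POSIXct("2013-03-31 01:00:00", tz = "CET")
times <- start + (0:5) * 60 * 15

# The underlying instants are evenly spaced: every step is 900 seconds.
step_secs <- diff(as.numeric(times))

# Formatted in UTC there is no jump; the gap exists only in local time.
utc_labels   <- format(times, "%H:%M", tz = "UTC")
local_labels <- format(times, "%H:%M", tz = "CET")

print(step_secs)
print(utc_labels)
print(local_labels)
```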
Rounding the vector times to hours gives NAs, even for times before 01:30, which shouldn't be affected by the transition at all.
library(lubridate)
round_date(times, unit = "hour")
[1] "2013-03-31 01:00:00 CET" NA
[3] NA NA
[5] NA "2013-03-31 03:00:00 CEST"
This seems to be a bug, or am I missing something? I am running:
sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=German_Austria.1252 LC_CTYPE=German_Austria.1252
[3] LC_MONETARY=German_Austria.1252 LC_NUMERIC=C
[5] LC_TIME=German_Austria.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lubridate_1.3.3
loaded via a namespace (and not attached):
[1] digest_0.6.4 memoise_0.2.1 plyr_1.8.1 Rcpp_0.11.2 stringr_0.6.2
It looks like the culprit is ceiling_date which is called by round_date:
ceiling_date(times,"hour")
[1] "2013-03-31 01:00:00 CET" NA
[3] NA NA
[5] NA "2013-03-31 04:00:00 CEST"
Looking at the code, it works by adding 1 to the hour, thereby creating a non-existent time. It is definitely a bug.
base::round has support for times to do what you want though:
round(times,"hour")
[1] "2013-03-31 01:00:00 CET" "2013-03-31 01:00:00 CET"
[3] "2013-03-31 03:00:00 CEST" "2013-03-31 03:00:00 CEST"
[5] "2013-03-31 03:00:00 CEST" "2013-03-31 03:00:00 CEST"
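If you would rather not depend on how a given function handles local clock fields, another option is to round on the epoch-seconds scale, where no hour is missing. The following is only a sketch: `round_hour_epoch` is a hypothetical helper, not part of base R or lubridate, and it assumes a time zone whose UTC offset is a whole number of hours (true for CET/CEST). It rounds half hours up.

```r
start <- as.POSIXct("2013-03-31 01:00:00", tz = "CET")
times <- start + (0:5) * 60 * 15

# Hypothetical helper: round to the nearest hour using seconds since the
# epoch, so the non-existent local hour 02:00 is never constructed.
# Assumes the zone's UTC offset is a whole number of hours.
round_hour_epoch <- function(x, tz = "CET") {
  secs <- floor((as.numeric(x) + 1800) / 3600) * 3600  # half hours round up
  as.POSIXct(secs, origin = "1970-01-01", tz = tz)
}

rounded <- round_hour_epoch(times)
print(rounded)
```

For the vector from the question this should agree with the round(times, "hour") result above, with no NAs.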
It's an edge case, and you could consider the behavior a bug. round_date uses ceiling_date, and this is what happens there:
y <- floor_date(times - eseconds(1), "hour")
#[1] "2013-03-31 00:00:00 CET" "2013-03-31 01:00:00 CET" "2013-03-31 01:00:00 CET" "2013-03-31 01:00:00 CET" "2013-03-31 01:00:00 CET" "2013-03-31 03:00:00 CEST"
hour(y) <- hour(y) + 1
#[1] "2013-03-31 01:00:00 CET" NA NA NA NA "2013-03-31 04:00:00 CEST"
As you can see, it tries to increment 2013-03-31 01:00:00 CET by one hour and doesn't deal correctly with the time zone transition: 02:00 does not exist in local time on that day, so the result is NA.
The root issue is probably in the "hour<-" POSIXct S4 method.
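For comparison, the same two steps (floor of the time minus one second, then one hour forward) can be done by adding 3600 seconds instead of assigning to the hour field. This is a rough base R sketch, not lubridate's implementation or an official fix: `ceiling_hour_epoch` is a hypothetical name, and flooring on epoch seconds assumes a zone with a whole-hour UTC offset.

```r
start <- as.POSIXct("2013-03-31 01:00:00", tz = "CET")
times <- start + (0:5) * 60 * 15

# Hypothetical helper mirroring the two steps shown above, but moving
# forward by elapsed seconds rather than by setting the local hour field.
ceiling_hour_epoch <- function(x, tz = "CET") {
  secs <- as.numeric(x)
  floored <- floor((secs - 1) / 3600) * 3600   # floor of (x - 1 second)
  as.POSIXct(floored + 3600, origin = "1970-01-01", tz = tz)
}

ceilings <- ceiling_hour_epoch(times)
print(ceilings)
```

Because the step forward is an elapsed hour, 01:00 CET plus one hour lands on 03:00 CEST rather than on the missing 02:00.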