Suppose I have the following data.frame foo:

           start.time duration
1 2012-02-06 15:47:00        1
2 2012-02-06 15:02:00        2
3 2012-02-22 10:08:00        3
4 2012-02-22 09:32:00        4
5 2012-03-21 13:47:00        5
And class(foo$start.time) returns
[1] "POSIXct" "POSIXt"
I'd like to create a plot of foo$duration vs. foo$start.time. In my scenario, I'm only interested in the time of day rather than the actual day of the year. How does one go about extracting the time of day as hours:seconds from a POSIXct vector?
There are two POSIX date/time classes, which differ in the way that the values are stored internally. The POSIXct class stores date/time values as the number of seconds since January 1, 1970, while the POSIXlt class stores them as a list with elements for second, minute, hour, day, month, and year, among others.
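The difference in internal storage can be seen directly in base R. The following is a minimal sketch, not part of the original post; the variable names ct and lt are illustrative, and tz = "UTC" is an assumption made so that the numeric value does not depend on the local time zone.

```r
# Parse one timestamp from the question. tz = "UTC" is an assumption made
# here so the result is the same on every machine.
ct <- as.POSIXct("2012-02-06 15:47:00", tz = "UTC")
lt <- as.POSIXlt(ct)

# POSIXct is a single number under the hood: seconds since 1970-01-01 UTC.
as.numeric(ct)

# POSIXlt is a list of broken-down components that can be read directly.
lt$hour
lt$min
lt$sec
```

Because POSIXlt exposes the hour and minute as list elements, converting to it is one way to get at the time of day without any string formatting.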
POSIXct stores both a date and a time with an associated time zone. The default time zone is the one your computer is set to, which is most often your local time zone. Internally the value is a count of seconds starting at 1 January 1970 (UTC).
Calling class(datetime) on such a value returns

[1] "POSIXct" "POSIXt"

POSIXt is a virtual class which cannot be used directly. From the R documentation: "A virtual class 'POSIXt' exists from which both of the classes inherit: it is used to allow operations such as subtraction to mix the two classes."
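As a small illustration of that shared parent class, the sketch below subtracts a POSIXlt value from a POSIXct value. It is not from the original post; the names ct, lt and gap are illustrative, and tz = "UTC" is again an assumption.

```r
# Two timestamps from the question, stored in the two different classes.
ct <- as.POSIXct("2012-02-06 15:47:00", tz = "UTC")
lt <- as.POSIXlt("2012-02-06 15:02:00", tz = "UTC")

# Both objects inherit from the virtual class POSIXt.
inherits(ct, "POSIXt")
inherits(lt, "POSIXt")

# Subtraction works across the two classes and returns a difftime.
gap <- ct - lt
as.numeric(gap, units = "mins")
```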
This is a good question, and highlights some of the difficulty in dealing with dates in R. The lubridate package is very handy, so below I present two approaches, one using base (as suggested by @RJ-) and the other using lubridate.
Recreate the first three rows of the data frame in the original post:
foo <- data.frame(start.time = c("2012-02-06 15:47:00", "2012-02-06 15:02:00", "2012-02-22 10:08:00"), duration = c(1,2,3))
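Convert to a date-time class (two ways to do this; note that strptime returns a POSIXlt object, while ymd_hms returns POSIXct):
# using base::strptime; tz = "UTC" matches the default time zone of ymd_hms
t.str <- strptime(foo$start.time, "%Y-%m-%d %H:%M:%S", tz = "UTC")
# using lubridate::ymd_hms
library(lubridate)
t.lub <- ymd_hms(foo$start.time)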
Now, extract time as decimal hours
# using base::format
h.str <- as.numeric(format(t.str, "%H")) + as.numeric(format(t.str, "%M"))/60
# using lubridate::hour and lubridate::minute
h.lub <- hour(t.lub) + minute(t.lub)/60
Demonstrate that these approaches are equal:
identical(h.str, h.lub)
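Both approaches above ignore the seconds, which is fine for the data in the question because every timestamp ends in :00. If seconds matter, one possible base-only variant reads the components from a POSIXlt object. This is a sketch under assumptions: the names t.lt, h.lt and tod are illustrative, tz = "UTC" is assumed, and the second timestamp (with 30 seconds) is made up for the example.

```r
# The first timestamp is from the question; the second is hypothetical and
# only added to show a non-zero seconds component.
t.lt <- as.POSIXlt(c("2012-02-06 15:47:00", "2012-02-22 10:08:30"), tz = "UTC")

# Decimal hours including seconds, read straight from the POSIXlt fields.
h.lt <- t.lt$hour + t.lt$min / 60 + t.lt$sec / 3600

# Time of day as a character string, if a label rather than a number is needed.
tod <- format(t.lt, "%H:%M:%S")
```

The numeric form is convenient for plotting on a continuous axis, while the character form is only suitable for labels, since strings are not ordered as times in general.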
Then choose one of the above approaches to assign the decimal hour to foo$hr:
foo$hr <- h.str
# If you prefer, the choice can be made at random:
foo$hr <- if (runif(1) > 0.5) { h.str } else { h.lub }
then plot using the ggplot2 package:
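library(ggplot2)
# foo$hr is numeric (decimal hours), so a continuous scale is used here;
# scale_x_datetime expects POSIXct values and fails on plain numbers.
ggplot(foo, aes(x = hr, y = duration)) +
  geom_point() +
  # label each axis break as HH:MM by converting decimal hours to whole minutes
  scale_x_continuous(labels = function(h) {
    m <- round(h * 60)
    sprintf("%02d:%02d", m %/% 60, m %% 60)
  })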