extract hours and seconds from POSIXct for plotting purposes in R

Suppose I have the following data.frame foo

               start.time duration
    1 2012-02-06 15:47:00        1
    2 2012-02-06 15:02:00        2
    3 2012-02-22 10:08:00        3
    4 2012-02-22 09:32:00        4
    5 2012-03-21 13:47:00        5

And class(foo$start.time) returns

[1] "POSIXct" "POSIXt"  

I'd like to create a plot of foo$duration v. foo$start.time. In my scenario, I'm only interested in the time of day rather than the actual day of the year. How does one go about extracting the time of day as hours:seconds from POSIXct class of vector?

andrewj asked May 22 '12 15:05


People also ask

What is the difference between POSIXlt and POSIXct?

There are two POSIX date/time classes, which differ in the way that the values are stored internally. The POSIXct class stores date/time values as the number of seconds since January 1, 1970, while the POSIXlt class stores them as a list with elements for second, minute, hour, day, month, and year, among others.
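The storage difference can be seen directly in a short base R sketch. The variable names (`x.ct`, `x.lt`, `secs`) are illustrative only, and the time zone is fixed to UTC here as an assumption so the result does not depend on the local machine:

```r
# The same instant held in both POSIX classes (UTC assumed for reproducibility)
x.ct <- as.POSIXct("2012-02-06 15:47:00", tz = "UTC")
x.lt <- as.POSIXlt(x.ct)

# POSIXct is a single number: seconds since 1970-01-01 00:00:00 UTC
secs <- as.numeric(x.ct)

# POSIXlt is a list of components that can be read by name
hr <- x.lt$hour  # hour of the day
mn <- x.lt$min   # minute of the hour

print(secs)
print(c(hr, mn))
```

For pulling out the time of day, the POSIXlt fields are often the most direct route, while POSIXct is usually the more convenient class to store in a data frame column.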

What does as.POSIXct do in R?

as.POSIXct converts a value to the POSIXct class, which stores both a date and a time with an associated time zone. The default time zone is the one your computer is set to, which is most often your local time zone. POSIXct stores the date and time as the number of seconds since 1 January 1970 (UTC).
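As a small illustration of the time zone handling, the sketch below parses the same clock time in two zones and compares the resulting instants. The zone names are chosen only for the example, and `America/New_York` is assumed to be available in the system time zone database:

```r
# Same wall-clock string, two different time zones
t.utc <- as.POSIXct("2012-02-06 15:47:00", tz = "UTC")
t.ny  <- as.POSIXct("2012-02-06 15:47:00", tz = "America/New_York")

# New York is 5 hours behind UTC in February, so 15:47 there
# is a later instant than 15:47 UTC
diff.hours <- as.numeric(difftime(t.ny, t.utc, units = "hours"))
print(diff.hours)
```

Passing `tz` explicitly avoids depending on whatever zone the machine happens to be set to.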

What is POSIXct POSIXt?

    class(datetime)
    ## [1] "POSIXct" "POSIXt"

POSIXt is a virtual class which cannot be used directly. “A virtual class 'POSIXt' exists from which both of the classes inherit: it is used to allow operations such as subtraction to mix the two classes.”


1 Answer

This is a good question, and highlights some of the difficulty in dealing with dates in R. The lubridate package is very handy, so below I present two approaches, one using base (as suggested by @RJ-) and the other using lubridate.

Recreate the first three rows of the data frame from the original post:

    foo <- data.frame(start.time = c("2012-02-06 15:47:00",
                                     "2012-02-06 15:02:00",
                                     "2012-02-22 10:08:00"),
                      duration   = c(1, 2, 3),
                      stringsAsFactors = FALSE)  # keep start.time as character

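Convert to a date-time class (two ways to do this). Note that strptime returns POSIXlt while ymd_hms returns POSIXct; both inherit from POSIXt.

    # using base::strptime (returns POSIXlt, local time zone by default)
    t.str <- strptime(foo$start.time, "%Y-%m-%d %H:%M:%S")

    # using lubridate::ymd_hms (returns POSIXct, UTC by default)
    library(lubridate)
    t.lub <- ymd_hms(foo$start.time)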

Now, extract time as decimal hours

    # using base::format
    h.str <- as.numeric(format(t.str, "%H")) +
             as.numeric(format(t.str, "%M")) / 60

    # using lubridate::hour and lubridate::minute
    h.lub <- hour(t.lub) + minute(t.lub) / 60
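A third variant, not part of the approaches above, reads the POSIXlt fields directly and also carries the seconds through. This is only a sketch: the names `t.ct`, `t.lt` and `h.lt` are made up for the example, and UTC is assumed so the result is reproducible.

```r
# Base-only sketch: decimal hours from POSIXlt components (UTC assumed)
t.ct <- as.POSIXct(c("2012-02-06 15:47:00",
                     "2012-02-06 15:02:00",
                     "2012-02-22 10:08:00"), tz = "UTC")
t.lt <- as.POSIXlt(t.ct)

# hour, min and sec are plain numeric list elements of a POSIXlt object
h.lt <- t.lt$hour + t.lt$min / 60 + t.lt$sec / 3600
print(h.lt)
```

Because the seconds are included, this version stays accurate for timestamps that do not fall on a whole minute.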

Demonstrate that these approaches are equal:

    identical(h.str, h.lub)

Then choose one of the above approaches to assign the decimal hour to foo$hr:

    foo$hr <- h.str

    # If you prefer, the choice can be made at random:
    foo$hr <- if (runif(1) > 0.5) { h.str } else { h.lub }

then plot using the ggplot2 package:

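    library(ggplot2)
    # foo$hr is numeric (decimal hours), so use a continuous scale rather than scale_x_datetime
    hhmm <- function(h) sprintf("%02d:%02d", round(h * 60) %/% 60, round(h * 60) %% 60)
    ggplot(foo, aes(x = hr, y = duration)) + geom_point() +
      scale_x_continuous("time of day", labels = hhmm)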
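If a true date-time axis is preferred, one alternative (a sketch, not the approach used in this answer) is to move every timestamp onto the same arbitrary reference day and let ggplot2 format the axis. The column name `tod` and the reference date 1970-01-01 are assumptions made for the example, and UTC is fixed throughout so that parsing and axis labels agree:

```r
library(ggplot2)

foo <- data.frame(
  start.time = as.POSIXct(c("2012-02-06 15:47:00",
                            "2012-02-06 15:02:00",
                            "2012-02-22 10:08:00"), tz = "UTC"),
  duration = c(1, 2, 3)
)

# Keep only the clock time by re-parsing it onto one reference day
foo$tod <- as.POSIXct(format(foo$start.time, "1970-01-01 %H:%M:%S"), tz = "UTC")

# date_labels formats the axis ticks as hours:minutes
p <- ggplot(foo, aes(x = tod, y = duration)) +
  geom_point() +
  scale_x_datetime(date_labels = "%H:%M", timezone = "UTC") +
  labs(x = "time of day", y = "duration")
print(p)
```

The trade-off is an extra POSIXct column carrying a dummy date, in exchange for axis breaks and labels that are handled by the date-time scale.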
David LeBauer answered Sep 19 '22 12:09