For the last few days I've been pondering the following problem: I want to plot the duration of an activity as a line, with the y axis representing the date on which the activity happens and the x axis representing the time in hours. When an activity starts and ends on the same day, this is easy.
For example:
df1<-structure(list(Date = structure(c(16802, 16803, 16805, 16806,
16809, 16810, 16812, 16813, 16816, 16820, 16821, 16822, 16829
), class = "Date"), hms1 = structure(c(1457623680, 1457620860,
1457621160, 1457622540, 1457625600, 1457621280, 1457620380, 1457619720,
1457623620, 1457621460, 1457620440, 1457617980, 1457621880), class = c("POSIXct", "POSIXt"), tzone = ""), hms2 = structure(c(1457632500, 1457627640,
1457628360, 1457629500, 1457631000, 1457625120, 1457625420, 1457624520,
1457627640, 1457626800, 1457626800, 1457622060, 1457625540), class = c("POSIXct",
"POSIXt"), tzone = "")), .Names = c("Date", "hms1", "hms2"), class = "data.frame", row.names = c(103L, 105L, 108L, 110L, 114L, 117L, 120L, 122L, 127L, 135L, 136L, 138L, 145L))
library(ggplot2)
library(scales)  # provides date_breaks()
p1 <- ggplot(df1, aes(x = Date, y = hms1)) + scale_x_date(breaks = date_breaks("1 day")) +
  geom_linerange(aes(ymin = hms1, ymax = hms2), color = "red", size = 2) + coord_flip()
p1 + ylab("Time") + ggtitle("Activity During Day")
gives the desired plot:
Note that the real date is in the Date column, while the dates in hms1 and hms2 are wrong because of the process I used to get the time in H:M:S format:
df1$hms1 <- format(df1$time, format = "%H:%M:%S")
df1$hms1 <- as.POSIXct(df1$hms1, format = "%H:%M:%S")
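To illustrate why the date part ends up wrong: when a string holding only a clock time is parsed, R fills in the missing date with the current one. A minimal sketch, assuming a made-up timestamp (`stamp` is hypothetical and stands in for one value of the original `time` column):

```r
# Hypothetical timestamp standing in for one value of the original `time` column
stamp <- as.POSIXct("2016-01-02 15:28:00", tz = "UTC")

# Keep only the clock time as text, then parse it back into a date-time
clock    <- format(stamp, format = "%H:%M:%S")
reparsed <- as.POSIXct(clock, format = "%H:%M:%S", tz = "UTC")

# The clock time survives the round trip...
format(reparsed, "%H:%M:%S")

# ...but the date is no longer 2016-01-02; it is the day the code is run
as.Date(reparsed)
```

So hms1 and hms2 are only meaningful as times of day, and the Date column has to carry the actual date.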
The thing is that most of the "activities" I'm trying to plot span two days. The only solution I have come up with so far is to shift both datetime values by an appropriate period, in order to "artificially" place them on the same day, as suggested here. This process, though, naturally results in the x axis displaying the "artificial" timestamps.
How could I take control of the tick marks in ggplot, so that I can make them display the right timestamps? Or, even better, can someone propose a more elegant solution? To be clear: I want the y axis to be just the date of the start of the event, but the x axis to span more than 24 hours.
I'm not sure what your real data structure looks like, so I created some time periods for illustration as follows: I used your values for Date and then used runif to select some random start and end times for each activity. Some time periods occur within a single date and some span two dates.
I then plotted the data as follows: horizontal lines are drawn at the vertical position of the date when the activity began. Each line begins at the time (in hours) when the activity started, measured from midnight on the date when the activity started, and then extends to the right for however many hours the activity lasted.
The horizontal scale extends to 48 hours, so that activities covering more than one date can be visualized. Vertical lines mark date boundaries, so you can see which activities continued into a second day. If you have activities lasting three or more days, just extend the horizontal scale as needed to accommodate them.
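If your real data hold full start and end timestamps, the same "hours since midnight of the start date" values can be computed directly instead of being simulated. This is only a sketch under assumptions: the data frame `acts` and its columns `start`, `end`, `day0`, `start_h` and `end_h` are hypothetical names, and everything is kept in UTC to sidestep time zone offsets.

```r
# Hypothetical activity log: one row per activity, with full start/end timestamps
acts <- data.frame(
  start = as.POSIXct(c("2016-01-02 22:30:00", "2016-01-03 09:00:00"), tz = "UTC"),
  end   = as.POSIXct(c("2016-01-03 03:15:00", "2016-01-03 11:30:00"), tz = "UTC")
)

# Midnight at the start of the day on which each activity began
acts$day0 <- as.POSIXct(trunc(acts$start, units = "days"))

# Hours elapsed since that midnight; end_h goes past 24 when the
# activity runs into the next day
acts$start_h <- as.numeric(difftime(acts$start, acts$day0, units = "hours"))
acts$end_h   <- as.numeric(difftime(acts$end,   acts$day0, units = "hours"))

acts[, c("day0", "start_h", "end_h")]
```

Here the first activity runs from hour 22.5 to hour 27.25, so it crosses the 24-hour boundary, while the second stays within a single day. `day0` would then play the role of Date, and `start_h` and `end_h` would be the `ymin` and `ymax` of the line range.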
# Convert Date to POSIXct and set HMS to beginning of day (i.e., Midnight).
# To do this, I added 8-hours to account for my locale being the U.S. Pacific time zone.
df1$Date = as.POSIXct(df1$Date) + 3600*8
## Create some times for start and end of each activity
set.seed(20)
df1$hms1a = df1$Date + runif(nrow(df1), 3600*5, 3600*10)
df1$hms2a = df1$Date + runif(nrow(df1), 3600*15, 3600*40)
library(ggplot2)
library(scales)
library(stringr)
ggplot(df1, aes(x = Date)) +
  scale_x_datetime(breaks = date_breaks("1 day")) +
  scale_y_continuous(limits = c(0, 48), breaks = seq(0, 48, 2),
                     labels = str_pad(seq(0, 48, 2) %% 24, 2, pad = "0")) +
  geom_hline(yintercept = seq(0, 48, 24)) +
  # Ask for hours explicitly; plain subtraction lets R choose the difftime units
  geom_linerange(aes(ymin = as.numeric(difftime(hms1a, Date, units = "hours")),
                     ymax = as.numeric(difftime(hms2a, Date, units = "hours"))),
                 color = "red", size = 2) +
  coord_flip() + ylab("Time (hours)") +
  ggtitle("Activity During Day")
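If you would rather have the tick labels show that a time belongs to a later day, the `labels` argument of scale_y_continuous also accepts a function of the break values. Below is one possible sketch; `hour_label` is a hypothetical helper of my own, not something provided by ggplot2 or scales, and it assumes the breaks fall on whole hours.

```r
# Hypothetical helper: turn hours-since-midnight into clock labels,
# marking ticks that fall on a later day than the start date
hour_label <- function(h) {
  h     <- as.integer(h)
  clock <- sprintf("%02d:00", h %% 24L)   # wrap back into 00..23
  day   <- h %/% 24L                      # 0 = start date, 1 = next day, ...
  ifelse(day > 0, paste0(clock, " (+", day, "d)"), clock)
}

hour_label(c(0, 22, 24, 26, 48))
# "00:00" "22:00" "00:00 (+1d)" "02:00 (+1d)" "00:00 (+2d)"
```

You could pass it as `labels = hour_label` in place of the str_pad() call in the plot above, keeping the same `breaks`.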