I am trying to reproduce a plot which visualizes the temporal span of a group of electronic tags but have been having little success. I have attached a simple example of the kind of plot I am looking to produce and the data which makes that plot up. Any help generating this plot using ggplot would be extremely helpful.
Please note that in the plot I don't care about year; I simply want to visualize the days and months over which a tag was recording data. Also note that for tags like 4120, which were put out late in the year (September) and continued to produce data through to the beginning of the following year (April), the bar continues through the end of the year, and then another bar starts in January and visualizes the rest of the tag record.
dat <- structure(list(Tag_Num = c(44386L, 44387L, 44388L, 44390L, 52236L,
52237L, 52238L, 60639L, 60641L, 61921L, 61925L, 61932L, 61936L,
61938L, 61940L, 61957L, 63975L, 63977L, 87565L, 100949L), Deploy = structure(c(1L,
3L, 2L, 9L, 5L, 7L, 14L, 6L, 4L, 13L, 15L, 20L, 10L, 12L, 8L,
19L, 16L, 11L, 18L, 17L), .Label = c("5/4/2004", "5/5/2004",
"5/6/2004", "6/22/2011", "6/24/2005", "6/24/2011", "6/26/2005",
"6/30/2006", "7/3/2004", "9/1/2006", "9/10/2007", "9/11/2007",
"9/12/2006", "9/15/2007", "9/21/2006", "9/22/2006", "9/24/2010",
"9/6/2008", "9/7/2006", "9/9/2006"), class = "factor"), Recover = structure(c(14L,
14L, 14L, 2L, 18L, 17L, 3L, 16L, 15L, 7L, 4L, 12L, 9L, 6L, 13L,
8L, 5L, 11L, 1L, 10L), .Label = c("12/20/2008", "12/31/2004",
"3/14/2008", "3/21/2007", "4/18/2007", "5/12/2008", "5/15/2007",
"5/16/2007", "5/21/2007", "5/22/2011", "5/8/2008", "5/9/2007",
"7/26/2006", "9/10/2004", "9/20/2011", "9/22/2011", "9/25/2005",
"9/8/2005"), class = "factor")), .Names = c("Tag_Num", "Deploy",
"Recover"), class = "data.frame", row.names = c(NA, -20L))
The figure no longer matches the above dataset but still gives an example of what I am trying to accomplish.

I found a solution, although I ended up relying on Julian dates to get this to work. I relied heavily on the lubridate, dplyr, and ggplot2 packages.
I spent a long time figuring out how the dataset should look. If you just have these five points, you could easily make a second row for 4120. Here is a way to do it on the whole dataset using do from dplyr.
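library(dplyr)
library(lubridate)
# Split any record that crosses a year boundary into two rows:
# deployment to Dec 31, then Jan 1 to recovery.
dat2 = dat %>%
    # Work with character dates so both branches return the same column type
    mutate(Deploy = as.character(Deploy), Recover = as.character(Recover)) %>%
    group_by(Tag_Num) %>%
    do(if(year(mdy(.$Deploy)) != year(mdy(.$Recover))) {
        data.frame(Deploy = c(.$Deploy, paste("1/1", year(mdy(.$Recover)), sep = "/")),
                   Recover = c(paste("12/31", year(mdy(.$Deploy)), sep = "/"), .$Recover),
                   stringsAsFactors = FALSE) }
    else { data.frame(Deploy = .$Deploy, Recover = .$Recover, stringsAsFactors = FALSE) } )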
Now the dataset looks like:
  Tag_Num    Deploy    Recover
1    4001  1/1/2014   9/1/2014
2    4120  9/1/2013 12/31/2013
3    4120  1/1/2014  4/20/2014
4    4356  1/1/2011  6/29/2011
5    4665 3/15/2010 10/17/2010
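For anyone who would rather avoid do, the same split can be sketched in base R. This is only an illustration under stated assumptions: split_at_new_year is a hypothetical helper (not from any package), the toy values mirror the printed example above, and, like the dplyr version, it assumes a tag never spans more than two calendar years.

```r
# Toy data (hypothetical values mirroring the printed example)
toy <- data.frame(Tag_Num = c(4001L, 4120L),
                  Deploy  = c("1/1/2014", "9/1/2013"),
                  Recover = c("9/1/2014", "4/20/2014"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: split rows that cross a year boundary into two rows
split_at_new_year <- function(df) {
  dep <- as.Date(df$Deploy, format = "%m/%d/%Y")
  rec <- as.Date(df$Recover, format = "%m/%d/%Y")
  dep_year <- as.integer(format(dep, "%Y"))
  rec_year <- as.integer(format(rec, "%Y"))
  crosses <- rec_year > dep_year

  same <- df[!crosses, , drop = FALSE]           # rows left untouched
  first <- df[crosses, , drop = FALSE]           # deployment to Dec 31
  first$Recover <- paste0("12/31/", dep_year[crosses])
  second <- df[crosses, , drop = FALSE]          # Jan 1 to recovery
  second$Deploy <- paste0("1/1/", rec_year[crosses])

  out <- rbind(same, first, second)
  out <- out[order(out$Tag_Num, as.Date(out$Deploy, format = "%m/%d/%Y")), ]
  rownames(out) <- NULL
  out
}

toy_split <- split_at_new_year(toy)
toy_split
```

Tag 4120 comes back as two rows (9/1/2013 to 12/31/2013, then 1/1/2014 to 4/20/2014), while tag 4001 is unchanged.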
I converted the Deploy and Recover dates to Julian Day (day of year) for the actual plotting. I put the year of deployment in as well, so you could technically do something like color by year in the plot.
dat2 = dat2 %>% ungroup %>%
mutate(year = year(mdy(Deploy)), JDeploy = yday(mdy(Deploy)),
JRecover = yday(mdy(Recover)), Tag_Num = factor(Tag_Num))
  Tag_Num    Deploy    Recover year JDeploy JRecover
1    4001  1/1/2014   9/1/2014 2014       1      244
2    4120  9/1/2013 12/31/2013 2013     244      365
3    4120  1/1/2014  4/20/2014 2014       1      110
4    4356  1/1/2011  6/29/2011 2011       1      180
5    4665 3/15/2010 10/17/2010 2010      74      290
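One thing to keep in mind when dropping the year: day of year shifts by one after February in leap years, so bars from different years are not aligned to the exact calendar day. A small base R check makes this visible. It uses format with "%j" rather than lubridate, which should give the same day-of-year numbers as yday; the dates are picked purely for illustration.

```r
# Same calendar dates in a leap year (2004) and a non-leap year (2005)
dates <- as.Date(c("12/31/2004", "12/31/2005", "3/1/2004", "3/1/2005"),
                 format = "%m/%d/%Y")

# "%j" gives the day of year as a zero-padded string, e.g. "061"
doy <- as.integer(format(dates, "%j"))
doy
# 366 365 61 60
```

For a plot at this scale a one-day offset is barely visible, but it does mean the x axis can run to 366, not 365.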
To put months on the x axis instead of Julian Day, I figured out the approximate Julian Day of the middle of each month to use as axis breaks. This seems a little hacky to me, but wasn't sure how else to define the breaks.
# Make breaks in Julian Day that will be equivalent to essentially midmonth?
xbreaks = yday(paste(2013, 1:12, c(15, 14, rep(15, 10)), sep = "-"))
# If want labels at start of each month rather than midmonth
xbreaks2 = yday(paste(2013, 1:12, 1, sep = "-"))
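If the breaks feel hacky, a possible alternative is to build them from real dates in base R and take the labels from the built-in month.abb constant, which is always in English. This is a sketch, not a replacement for the code above; xbreaks_alt and xlabels_alt are made-up names, and 2013 is used only because it is a non-leap year.

```r
# First day of each month in a non-leap reference year
month_starts <- as.Date(sprintf("2013-%02d-01", 1:12))

# Day of year for each month start
xbreaks_alt <- as.integer(format(month_starts, "%j"))
xbreaks_alt
# 1 32 60 91 121 152 182 213 244 274 305 335

# Built-in English month abbreviations: "Jan", "Feb", ...
xlabels_alt <- month.abb
```

These could then be passed to scale_x_continuous as breaks = xbreaks_alt, labels = xlabels_alt. Unlike format(..., "%b"), month.abb does not depend on the locale of the R session.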
Then plot with ggplot2. This relies on calling as.numeric on the factor Tag_Num so it can be used in geom_segment. The y axis labels are then set from the levels of Tag_Num. You can change the order of the y axis by changing the order of the levels of Tag_Num in the dataset.
EDIT
With more tags, the numeric breaks on the y axis no longer represent every single unique tag by default (seen with updated dataset in OP). You can solve this by setting the breaks in scale_y_continuous.
library(ggplot2)
ggplot(dat2, aes(x = JDeploy, xend = JRecover, y = as.numeric(Tag_Num), yend = as.numeric(Tag_Num))) +
    # In ggplot2 3.4.0 and later, use linewidth = 5 instead of size = 5
    geom_segment(size = 5) +
    # One break per factor level, so breaks and labels always line up
    scale_y_continuous(breaks = seq_along(levels(dat2$Tag_Num)), labels = paste("Tag", levels(dat2$Tag_Num))) +
    ylab(NULL) +
    xlab(NULL) +
    scale_x_continuous(breaks = xbreaks2, labels = format(ISOdate(2004,1:12,1),"%b"))
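As an example of reordering the y axis through the factor levels, here is a rough sketch that sorts tags by deployment day of year. It uses a small hypothetical data frame in place of dat2. Because a tag split at the year boundary gets an extra row starting on day 1, the sketch takes the largest JDeploy per tag as its real deployment day (this assumes no split tag was actually deployed on January 1).

```r
# Hypothetical stand-in for dat2; tag 4120 was split at the year boundary
toy <- data.frame(Tag_Num = factor(c("4001", "4120", "4120", "4665")),
                  JDeploy = c(1, 244, 1, 74))

# Real deployment day per tag: the largest JDeploy among its rows
deploy_day <- tapply(toy$JDeploy, toy$Tag_Num, max)

# Rebuild the factor with levels ordered by deployment day
toy$Tag_Num <- factor(toy$Tag_Num,
                      levels = names(deploy_day)[order(deploy_day)])
levels(toy$Tag_Num)
# "4001" "4665" "4120"
```

After this, as.numeric(Tag_Num) follows the new level order, so the same plotting code draws the earliest-deployed tag at the bottom of the y axis.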