ggplot: Multiple years on same plot by month

Tags: plot, r, ggplot2

So, I've hit something I don't think I have ever come across. I scoured Google looking for the answer, but have not found anything (yet)...

I have two data sets - one for 2015 and one for 2016. They represent the availability of an IT system. The data frames read as such:

2015 Data Set:

variable value
Jan 2015 100
Feb 2015 99.95
...      ...

2016 Data Set:

variable value
Jan 2016 99.99
Feb 2016 99.90
...      ...

They just go from Jan - Dec listing the availability of the system. The "variable" column is of class yearmon (created with as.yearmon) and "value" is a plain numeric.

I want to create a geom_line() chart with ggplot2 that has the percentages on the y-axis and the months on the x-axis. I have been able to do this with two lines, but the x-axis runs from Jan 2015 - Dec 2016. What I'd like is to have them plotted by month only, so they overlap. I have tried various things with the scales and so forth, but I have yet to figure out how to do this.

Basically, I need the x-axis to read January - December in chronological order, but I want to plot both 2015 and 2016 on the same chart. Here is my ggplot code (non-working) as I have it now:

ggplot(data2015,aes(variable,value)) +
geom_line(aes(color="2015")) +
geom_line(data=data2016,aes(color="2016")) +
scale_x_yearmon() +
theme_classic()

This plots in a continuous stream as I am dealing with a yearmon() data type. I have tried something like this:

ggplot(data2015,aes(months(variable),value)) +
geom_line(aes(color="2015")) +
geom_line(data=data2016,aes(color="2016")) +
theme_classic()

Obviously that won't work. I figure months() is probably still carrying the year somehow. If I plot them as factors, they are not in chronological order. Any help would be very much appreciated. Thank you in advance!

azdatasci asked Dec 18 '16 06:12


2 Answers

To get a separate line for each year, you need to extract the year from each date and map it to colour. To get months (without year) on the x-axis, you need to extract the month from each date and map to the x-axis.

library(zoo)
library(lubridate)
library(ggplot2)

Let's create some fake data with the dates in as.yearmon format. I'll create two separate data frames so as to match what you describe in your question:

# Fake data
set.seed(49)
dat1 = data.frame(date = seq(as.Date("2015-01-15"), as.Date("2015-12-15"), "1 month"),
                 value = cumsum(rnorm(12)))
dat1$date = as.yearmon(dat1$date)

dat2 = data.frame(date = seq(as.Date("2016-01-15"), as.Date("2016-12-15"), "1 month"),
                  value = cumsum(rnorm(12)))
dat2$date = as.yearmon(dat2$date)

Now for the plot. We'll extract the year and month from date with the year and month functions, respectively, from the lubridate package. We'll also turn the year into a factor, so that ggplot will use a categorical color palette for year, rather than a continuous color gradient:

ggplot(rbind(dat1,dat2), aes(month(date, label=TRUE, abbr=TRUE), 
                value, group=factor(year(date)), colour=factor(year(date)))) +
  geom_line() +
  geom_point() +
  labs(x="Month", colour="Year") +
  theme_classic()
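
The same year/month split can also be done without lubridate. The following is only a minimal sketch using zoo and base R, not part of the answer's original code: split_yearmon and ym are hypothetical names chosen for illustration, and the month labels come from base R's month.abb, so they are English abbreviations.

```r
library(zoo)

# Hypothetical helper (an assumption, not from the answer above): split a
# yearmon vector into a year factor and a month factor in calendar order.
split_yearmon <- function(ym) {
  data.frame(
    year  = factor(format(ym, "%Y")),                         # e.g. "2015"
    month = factor(month.abb[cycle(ym)], levels = month.abb)  # Jan..Dec order
  )
}

# Illustrative input: 24 consecutive months starting January 2015
ym <- as.yearmon(2015 + (0:23) / 12)
parts <- split_yearmon(ym)
```

Mapping parts$month to x and parts$year to both colour and group should then give one line per year on a January to December axis, because the factor levels fix the month order.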


eipi10 answered Sep 24 '22 09:09



month  value  year
Jan    100    2015
Feb    99.95  2015
Jan    99.99  2016
Feb    99.90  2016

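You need one long-format dataset that has a year column, as above. Then you can plot both lines with ggplot. Make year a factor so it gets a discrete colour scale, set group so each year is drawn as a single line, and store month as a factor with levels in calendar order (e.g. levels = month.abb) so the x-axis is not sorted alphabetically:

ggplot(dataset, aes(x = month, y = value, color = factor(year), group = year)) + geom_line()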
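
One way to get from the two data frames in the question to a single long-format dataset is sketched below. It assumes the column names variable (a yearmon) and value from the question. The values are randomly generated placeholders, not the real availability figures, and make_year is a hypothetical helper used only to fabricate input of the right shape.

```r
library(zoo)
library(ggplot2)

# Hypothetical helper (assumption): 12 months of placeholder availability
# values for one year, with the same columns as the question's data frames.
make_year <- function(year) {
  df <- data.frame(value = round(runif(12, 99.5, 100), 2))
  df$variable <- as.yearmon(year + (0:11) / 12)
  df
}

set.seed(1)
data2015 <- make_year(2015)
data2016 <- make_year(2016)

# Stack the two years, then derive month and year columns from the yearmon
dataset <- rbind(data2015, data2016)
dataset$month <- factor(month.abb[cycle(dataset$variable)], levels = month.abb)
dataset$year  <- factor(format(dataset$variable, "%Y"))

# group = year keeps each year as one line; with a factor on x, ggplot2
# would otherwise put every point in its own group
p <- ggplot(dataset, aes(x = month, y = value, colour = year, group = year)) +
  geom_line() +
  labs(x = "Month", y = "Availability (%)", colour = "Year") +
  theme_classic()
p
```

Because year is already a factor here, the factor(year) wrapper in the call above is not needed in this sketch.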

kabr answered Sep 24 '22 09:09
