
R - ggplot2 time series x-axis to show last day of the month

Why does ggplot keep giving me the first day of the month when plotting a time series?
Here is a sample of my code:

library(ggplot2)
library(dplyr)
date <- as.Date(c("2008-01-31",
"2008-02-29",
"2008-03-31",
"2008-04-30",
"2008-05-31"))


count <- sample(5)
df <- data.frame(date = date, count = count)
df %>% 
  ggplot(aes(x = date, y = count))+
  geom_line()+
  scale_x_date(date_breaks = "1 month",
               date_labels = '%m/%d')  

I want the x-axis to show the actual dates from the df, i.e. the last day of each month. Instead it shows the first day of the next month.
I tried searching for this but could not find an applicable solution.

Thanks.

jmich738 asked Apr 11 '18 03:04


3 Answers

Perhaps the most straightforward solution is simply to use breaks instead of date_breaks, referring directly to your column of dates in the dataframe.

df %>% 
  ggplot(aes(x = date, y = count))+
  geom_line()+
  scale_x_date(date_labels = '%m/%d', breaks = df$date)

Marcus Campbell answered Sep 30 '22 20:09


You and ggplot are thinking about the dates differently.

You're thinking about the dates like labels. In your example you have 5 things, which you want plotted in order, and those labels should appear on the axis.

ggplot is thinking about the dates like dates. If you only gave it the values 2 and 5, as they're numeric it would add all the points between them, e.g. 2.5, 3, 4, etc. Since you've given dates, it sticks all the dates in between as well.

The axis labels are derived from the range of the axis and have nothing to do with the individual values of the variable. ggplot has placed the points in the right spots, but it has chosen the axis breaks itself.
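To see where monthly breaks land, the break function can be called directly on the date range. This is a minimal sketch, assuming the scales package is installed and that `breaks_width()` behaves roughly like the machinery behind `date_breaks = "1 month"`; the variable names are only for illustration.

```r
library(scales)

dates <- as.Date(c("2008-01-31", "2008-02-29", "2008-03-31",
                   "2008-04-30", "2008-05-31"))

# breaks_width() builds a break function from a width such as "1 month".
# It only sees the axis range, never the individual data values.
monthly_breaks <- breaks_width("1 month")
axis_breaks <- monthly_breaks(range(dates))

# Inspect where the breaks fall: each one should sit on a month boundary
# (day "01"), not on the month-end dates in the data.
format(axis_breaks, "%m/%d")
```

Because the breaks are aligned to conventional month boundaries, month-end data points will never line up with them unless the breaks are supplied explicitly.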

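This leaves you with two options:

1. If you want to stick with the data type "Date", swap the date_breaks option for plain breaks and specify exactly the dates you want. Note that seq() stepping by "month" from a month-end date spills over into the following month whenever the day does not exist (there is no February 31), so build the sequence from the first of each following month and subtract one day, e.g.

scale_x_date(breaks = seq(as.Date("2008-02-01"), by = "month", length.out = 5) - 1,
             date_labels = '%m/%d')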
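One caveat when generating monthly breaks from a month-end date: base R's `seq()` with `by = "month"` advances the month and keeps the day, and a day that does not exist in the target month is counted forward into the next one. A possible workaround is sketched below; `month_end_breaks` is a hypothetical helper name, not part of ggplot2 or base R.

```r
dates <- as.Date(c("2008-01-31", "2008-02-29", "2008-03-31",
                   "2008-04-30", "2008-05-31"))

# Stepping by month from Jan 31 asks for "Feb 31", which does not exist,
# so some of these values can roll over into the following month.
naive <- seq(min(dates), by = "month", length.out = length(dates))

# Hypothetical helper: the last day of every month spanned by `dates`.
month_end_breaks <- function(dates) {
  # First day of the earliest and latest month in the data.
  first_of_month <- as.Date(format(range(dates), "%Y-%m-01"))
  starts <- seq(first_of_month[1], first_of_month[2], by = "month")
  # First day of each *following* month; one day earlier is a month end.
  next_starts <- seq(starts[1], by = "month",
                     length.out = length(starts) + 1)[-1]
  next_starts - 1
}

month_ends <- month_end_breaks(dates)
format(month_ends, "%m/%d")
```

The result could then be passed as `scale_x_date(breaks = month_end_breaks(df$date), date_labels = '%m/%d')`, which keeps the axis a true date axis while labelling only month ends.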

2. If you actually want these to be labels (e.g. you don't want to put any points between these dates), consider making date a factor and just plotting that.

date <- factor(c("2008-01-31",
              "2008-02-29",
              "2008-03-31",
              "2008-04-30",
              "2008-05-31"))

count <- sample(5)
df <- data.frame(date = date, count = count)
df %>% 
  ggplot(aes(x = as.numeric(date), y = count))+
  geom_line() +
  scale_x_continuous(breaks = seq_along(levels(date)),
                     labels = format(as.Date(levels(date)), "%m/%d"))

Wrapping as.numeric() around date inside aes() converts the factor to its integer codes, so geom_line() can draw a line through the points. We then just need to set the labels to what we want them to be, which requires converting the factor back to a date and formatting it as month/day.
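The conversion described above can be checked in base R on its own, without plotting anything. This is only an illustrative sketch; `date_f`, `codes` and `axis_labels` are made-up names.

```r
date_f <- factor(c("2008-01-31", "2008-02-29", "2008-03-31",
                   "2008-04-30", "2008-05-31"))

# as.numeric() on a factor returns the integer level codes, not the dates.
# ISO-formatted date strings sort chronologically, so the codes run 1..5
# in date order.
codes <- as.numeric(date_f)

# One label per level: convert the level text back to Date, then format it.
axis_labels <- format(as.Date(levels(date_f)), "%m/%d")

data.frame(code = codes, label = axis_labels[codes])
```

One consequence of this design is that the points become equally spaced at 1, 2, 3, ..., so the differing lengths of the months are no longer reflected on the x-axis.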

LachlanO answered Sep 30 '22 21:09


Removing date_breaks and adding breaks = df$date seems to give the desired outcome. Here the dates are converted to POSIXct, so scale_x_datetime is used instead of scale_x_date.

df <- data.frame(date = as.POSIXct(date), count = count)
df %>% 
  ggplot(aes(x = date, y = count)) +
  geom_line() +
  scale_x_datetime(breaks = df$date, date_labels = '%m/%d') 
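A point to keep in mind with the POSIXct route: converting a Date gives midnight UTC, so the day that gets printed depends on the time zone used for formatting. The sketch below shows this in base R; the time zone names are just examples and assume the usual Olson database is available.

```r
# A Date converted to POSIXct represents midnight UTC on that day.
midnight_utc <- as.POSIXct(as.Date("2008-01-31"))

# The same instant formatted in two different time zones.
label_utc <- format(midnight_utc, "%m/%d", tz = "UTC")
label_ny  <- format(midnight_utc, "%m/%d", tz = "America/New_York")

c(UTC = label_utc, New_York = label_ny)
```

If the labels come out one day early on a given machine, passing `timezone = "UTC"` to `scale_x_datetime()` is one way to pin the formatting down; staying with Date and `scale_x_date()` avoids the issue altogether.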

Suren answered Sep 30 '22 21:09