Why does ggplot keep giving me the first day of the month when plotting a time series?
Here is a sample of my code:
library(ggplot2)
library(dplyr)
date <- as.Date(c("2008-01-31",
"2008-02-29",
"2008-03-31",
"2008-04-30",
"2008-05-31"))
count <- sample(5)
df <- data.frame(date = date, count = count)
df %>%
ggplot(aes(x = date, y = count))+
geom_line()+
scale_x_date(date_breaks = "1 month",
date_labels = '%m/%d')
I want the x-axis to show the actual dates from the df, i.e. the last day of each month. Instead it shows the first day of each month.
I tried searching for this but could not find an applicable solution.
Thanks.
Perhaps the most straightforward solution is simply to use breaks instead of date_breaks, referring directly to your column of dates in the dataframe.
df %>%
ggplot(aes(x = date, y = count))+
geom_line()+
scale_x_date(date_labels = '%m/%d', breaks = df$date)
You and ggplot are thinking about the dates differently.
You're thinking about the dates like labels. In your example you have 5 things, which you want plotted in order, and those labels should appear on the axis.
ggplot is thinking about the dates as dates, that is, as positions on a continuous scale. If you gave it only the numeric values 2 and 5, the axis would still cover every value in between (2.5, 3, 4, and so on). Since you've given it dates, the axis covers all the dates in between as well.
The axis breaks are derived from the range of the axis, not from the individual values in your data. ggplot has placed the points in the right spots, but it has chosen the axis labels itself; with date_breaks = "1 month", those breaks fall on the first day of each month.
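To make the mismatch concrete, here is a minimal base R sketch. It assumes that monthly breaks behave roughly like a sequence of month starts covering the data range; the variable names are illustrative and this is an approximation, not ggplot's actual internals.

```r
# The dates from the question
dates <- as.Date(c("2008-01-31", "2008-02-29", "2008-03-31",
                   "2008-04-30", "2008-05-31"))

# Dates are stored as day counts, which is why the axis is continuous
day_numbers <- as.numeric(dates)

# Assumed approximation of monthly breaks: month starts spanning the data range
month_starts <- seq(as.Date("2008-01-01"), as.Date("2008-06-01"), by = "month")
break_labels <- format(month_starts, "%m/%d")
print(break_labels)

# None of the data values coincides with one of these breaks
print(dates %in% month_starts)
```

Under that assumption every label lands on the first of a month, so the month-end values in the data never line up with a labelled break.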
This leaves you with two options:
1. If you want to stick with the data type "Date", swap the date_breaks option for plain breaks and specify exactly which dates you want, e.g.
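# Month-end breaks: take the first day of each following month, then subtract one day.
# (seq() starting at "2008-01-31" with by = "month" would roll February over into March.)
scale_x_date(breaks = seq(as.Date("2008-02-01"), as.Date("2008-06-01"), by = "month") - 1,
date_labels = '%m/%d')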
2. If you actually want these to be labels (i.e. you don't want any points placed between these dates), consider making date a factor and just plotting that.
date <- factor(c("2008-01-31",
"2008-02-29",
"2008-03-31",
"2008-04-30",
"2008-05-31"))
count <- sample(5)
df <- data.frame(date = date, count = count)
df %>%
ggplot(aes(x = as.numeric(date), y = count))+
geom_line() +
scale_x_continuous(breaks = seq_along(date), labels = format(as.Date(date), "%m/%d"))
Wrapping as.numeric around date in the aes call converts the factor to its integer codes (so geom_line will draw a line between the points). We then just need to set the labels to what we want, which requires converting the factor back to a date and formatting it as month/day.
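One caveat for option 1: building month-end breaks with seq() directly from a month-end date can go wrong, because stepping "by month" keeps the day of the month and overflows in shorter months. The sketch below, using only base R and illustrative variable names, compares that naive sequence with one built from month starts minus one day.

```r
dates <- as.Date(c("2008-01-31", "2008-02-29", "2008-03-31",
                   "2008-04-30", "2008-05-31"))

# Naive approach: step by month from the first month-end.
# "February 31" does not exist, so that step is counted forward into March.
naive <- seq(min(dates), max(dates), by = "month")
print(naive)

# Safer approach: first day of each following month, minus one day
first_days <- seq(as.Date("2008-02-01"), as.Date("2008-06-01"), by = "month")
month_ends <- first_days - 1
print(month_ends)

# Check whether each sequence reproduces the dates in the data
print(all(month_ends == dates))
print(all(naive == dates))
```

If the dates you want labelled are already in the data frame, passing breaks = df$date avoids the issue entirely.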
Removing date_breaks and adding breaks = df$date seems to give the desired outcome.
df <- data.frame(date = as.POSIXct(date), count = count)
df %>%
ggplot(aes(x = date, y = count)) +
geom_line() +
scale_x_datetime(breaks = df$date, date_labels = '%m/%d')