ggplot - x-axis shows data beyond specified range for longer time periods

Tags: graph, r, ggplot2

Here's some sample data for a company's net revenue, split into two cohorts:

library(ggplot2)

# 48 months of revenue for two cohorts (two rows per month, one per cohort)
data <- data.frame(dates = rep(seq(as.Date("2000/1/1"), by = "month", length.out = 48), each = 2),
                   revenue = rep(seq(10000, by = 1000, length.out = 48), each = 2) * rnorm(96, mean = 1, sd = 0.1),
                   cohort = c("Group 1", "Group 2"))

I can show one year's worth of data and it returns what I would expect:

start = "2000-01-01"
end = "2000-12-01"

ggplot(data, aes(fill = cohort, x = dates, y = revenue)) +
    geom_bar(stat = "identity", position = position_dodge(width = NULL)) +
    xlab("Month") + 
    ylab("Net Revenue") +
    geom_text(aes(label = round(revenue, 0)), vjust = -0.5, size = 3, position = position_dodge(width = 25)) + 
    scale_x_date(date_breaks = "1 month", limits = as.Date(c(start, end))) +
    ggtitle("Monthly Revenue by Group") +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 60, hjust = 1), plot.title = element_text(hjust = 0.5)) +
    scale_fill_manual(values=c("#00BFC4", "#F8766D"))


But if I expand the date range to two years or more and rerun the plot, the x-axis shows an extra month on each side, even though there are no bars for those months.

start = "2000-01-01"
end = "2001-12-01"
#rerun the ggplot code from above

Note the axis labels for 1999-12-01 and 2002-01-01, which have no data. Why do these appear and how can I remove them?

Nicholas Hassan asked Mar 03 '23 19:03



1 Answer

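The position scale functions (scale_x_*, scale_y_*) take expand= as an argument. It's common in R plots (both base and ggplot2) to expand the axes just a little bit, I think so that none of the lines/points are scrunched up against the "box" boundary: base R pads by 4% on each end, and ggplot2 pads continuous scales (dates included) by 5% of the range on each side by default. Over a two-year range that padding is about 35 days, which is enough to reach the first of the neighbouring months, so date_breaks = "1 month" puts a break there. Over one year the padding is only about 17 days, so no extra break appears.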
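The padding arithmetic can be checked by hand. The following is a rough base R sketch, not ggplot2's own code: it assumes the default 5% multiplicative padding and that monthly breaks land on the first of each month. The helpers pad_limits() and extra_breaks() are hypothetical names made up for this illustration.

```r
# Assumption: continuous ggplot2 scales pad by 5% of the range on each side.
pad_limits <- function(start, end, mult = 0.05) {
  lims <- as.Date(c(start, end))
  pad <- as.numeric(diff(lims)) * mult  # padding in days
  lims + c(-pad, pad)
}

# Candidate monthly breaks (first of each month) around the plotted period.
month_starts <- seq(as.Date("1999-11-01"), as.Date("2002-02-01"), by = "month")

# Month starts that fall inside the padded range but outside the requested limits.
extra_breaks <- function(start, end) {
  lims <- as.Date(c(start, end))
  padded <- pad_limits(start, end)
  inside_padded <- month_starts >= padded[1] & month_starts <= padded[2]
  outside_limits <- month_starts < lims[1] | month_starts > lims[2]
  month_starts[inside_padded & outside_limits]
}

extra_breaks("2000-01-01", "2000-12-01")  # one year: no extra month starts
extra_breaks("2000-01-01", "2001-12-01")  # two years: 1999-12-01 and 2002-01-01
```

The one-year range spans 335 days, so the padding stops in mid-December on both ends. The two-year range spans 700 days, so the padding crosses a month boundary on each side.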

If you add expand = c(0, 0) to scale_x_date(), the padding is removed and the axis stops exactly at the limits you set, which gets you what you want.

(BTW: you have mismatched parens. Fixed here.)

ggplot(data, aes(fill = cohort, x = dates, y = revenue)) +
    geom_bar(stat = "identity", position = position_dodge(width = NULL)) +
    xlab("Month") + 
    ylab("Net Revenue") +
    geom_text(aes(label = round(revenue, 0)), vjust = -0.5, size = 3, position = position_dodge(width = 25)) + 
    scale_x_date(date_breaks = "1 month", limits = as.Date(c(start, end)), expand = c(0, 0)) +
    ggtitle("Monthly Revenue by Group") +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 60, hjust = 1), plot.title = element_text(hjust = 0.5)) +
    scale_fill_manual(values=c("#00BFC4", "#F8766D"))
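One way to confirm the effect without looking at the picture is to inspect the x range of the built plot. This is only a sketch under assumptions: it uses a trimmed 24-month version of the sample data with a fixed seed, a hypothetical helper x_range(), and the panel_params[[1]]$x.range field of the ggplot_build() result, which is an internal structure (present in ggplot2 3.3.0 and later as far as I know) and may change between versions.

```r
library(ggplot2)

set.seed(42)  # assumption: fixed seed so the sample data is reproducible
data <- data.frame(
  dates = rep(seq(as.Date("2000-01-01"), by = "month", length.out = 24), each = 2),
  revenue = rep(seq(10000, by = 1000, length.out = 24), each = 2) *
    rnorm(48, mean = 1, sd = 0.1),
  cohort = c("Group 1", "Group 2")
)
lims <- as.Date(c("2000-01-01", "2001-12-01"))

base_plot <- ggplot(data, aes(x = dates, y = revenue, fill = cohort)) +
  geom_col(position = position_dodge())

# Hypothetical helper: x range of the first panel, converted back to Date.
# Assumption: panel_params[[1]]$x.range is available (internal, version dependent).
x_range <- function(p) {
  rng <- ggplot_build(p)$layout$panel_params[[1]]$x.range
  as.Date(rng, origin = "1970-01-01")
}

padded_range <- x_range(
  base_plot + scale_x_date(date_breaks = "1 month", limits = lims)
)
flush_range <- x_range(
  base_plot + scale_x_date(date_breaks = "1 month", limits = lims, expand = c(0, 0))
)

padded_range  # wider than lims on both sides
flush_range   # same as lims
```

With the default expansion the range should extend past both limits, while with expand = c(0, 0) it should match them.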


r2evans answered Mar 05 '23 14:03
