
ggplot monthly date scale on x axis uses days as units

Tags:

r

ggplot2

When plotting a bar chart with monthly data, ggplot shortens the distance between February and March, making the chart look inconsistent. Here is an example:

library(dplyr)
library(ggplot2)
library(lubridate)


## simulating sample data 

set.seed(.1073)
my_df <- data.frame(my_dates = sample(seq(as.Date('2010-01-01'), as.Date('2016-12-31'), 1), 1000, replace = TRUE))


### aggregating + visualizing counts per month 

my_df %>%  
  mutate(my_dates = round_date(my_dates, 'month')) %>%  
  group_by(my_dates) %>%  
  summarise(n_row = n()) %>%  
  ggplot(aes(x = my_dates, y = n_row))+ 
  geom_bar(stat = 'identity', color = 'black',fill = 'slateblue', alpha = .5)+
  scale_x_date(date_breaks = 'months', date_labels = '%y-%b') +
  theme(axis.text.x = element_text(angle = 60, hjust = 1)) 

Mouad_Seridi asked Dec 03 '25 09:12


2 Answers

I would keep the dates as dates rather than converting them to factors. Factors do keep the bars uniform in size, but you have to remember to join in any missing months so that empty months aren't skipped, and factor levels are easy to get out of order. Instead, I would recommend adjusting the aesthetics to reduce the effect that the black outline has on the gap between February and March.

Here are two examples:

  • Adjust the outline color to be white. This reduces the contrast and makes the gap less noticeable.
  • Set the width to 20 (days).

As an aside, you don't need to summarize the data: you can use floor_date() or round_date() in an earlier step and go straight into geom_bar().

dates <- seq(as.Date("2010-01-01"), as.Date("2016-12-31"), 1)

set.seed(.1073)
my_df <-
  tibble(
    my_dates = sample(dates, 1000, replace = TRUE),
    floor_dates = floor_date(my_dates, "month")
  )

ggplot(my_df, aes(x = floor_dates)) +
  geom_bar(color = "white", fill = "slateblue", alpha = .5)

ggplot(my_df, aes(x = floor_dates)) +
  geom_bar(color = "black", fill = "slateblue", alpha = .5, width = 20)
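
The uneven gap comes from the date scale measuring x in days: month starts are 28 to 31 days apart, while every bar gets the same width. Here is a minimal base-R sketch of that arithmetic. It assumes ggplot2's documented default bar width of 90% of the smallest spacing between x values; the variable names are only for illustration.

```r
# First day of each month in 2010, plus January 2011 (2010 is not a leap year)
month_starts <- seq(as.Date("2010-01-01"), as.Date("2011-01-01"), by = "month")

# Distance in days between consecutive month starts
spacing <- as.numeric(diff(month_starts))
names(spacing) <- format(head(month_starts, -1), "%b")

# Assumed default: bar width = 90% of the smallest spacing
bar_width <- 0.9 * min(spacing)

# Gap left between neighbouring bars, in days
gap <- spacing - bar_width
print(round(gap, 1))
```

Under that assumption the gap after February is much smaller than the gap after a 31-day month. With width = 20, the same arithmetic gives gaps of 8 and 11 days, so the difference is proportionally less visible.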

yake84 answered Dec 05 '25 03:12


Using some parts from IceCream's answer, you can try this. Note that geom_col() is now the recommended function for this case.

my_df %>%  
  mutate(my_dates = factor(round_date(my_dates, 'month'))) %>% 
  group_by(my_dates) %>%  
  summarise(n_row = n()) %>%  
  ungroup() %>% 
  mutate(my_dates_x = as.numeric(my_dates)) %>% 
  mutate(my_dates_label = paste(month(my_dates, label = TRUE), year(my_dates))) %>% 
  {ggplot(., aes(x = my_dates_x, y = n_row)) + 
    geom_col(color = 'black', width = 0.8, fill = 'slateblue', alpha = .5) +
    scale_x_continuous(breaks = .$my_dates_x, labels = .$my_dates_label) + 
    theme(axis.text.x = element_text(angle = 60, hjust = 1))}
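
One caveat with the factor approach, raised in the other answer, is that a month with no rows simply disappears from the axis. A rough base-R sketch of filling such months with zero counts before building the factor might look like this. The `counts` data frame and its values are made up for illustration and stand in for the summarised data.

```r
# Hypothetical monthly counts with March 2010 missing entirely
counts <- data.frame(
  my_dates = as.Date(c("2010-01-01", "2010-02-01", "2010-04-01")),
  n_row    = c(12L, 9L, 15L)
)

# Every month between the first and last observed month
all_months <- data.frame(
  my_dates = seq(min(counts$my_dates), max(counts$my_dates), by = "month")
)

# Left join so missing months appear as rows, then set their count to 0
filled <- merge(all_months, counts, by = "my_dates", all.x = TRUE)
filled$n_row[is.na(filled$n_row)] <- 0L

# Build the factor from the sorted dates so the levels stay chronological
filled$month_label <- factor(
  format(filled$my_dates, "%Y-%m"),
  levels = format(filled$my_dates, "%Y-%m")
)
print(filled)
```

The filled data frame can then go through the same numeric-position and label steps as above, so the empty month keeps its slot on the axis.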

Roman answered Dec 05 '25 04:12


