
Formatting ggplot2 axis labels with commas (and K? MM?) if I already have a y-scale

I am trying to format Cost and Revenue (both in thousands) and Impressions (in millions) data for a ggplot graph's y-axis labels.

My plot runs from 31 days ago to 'yesterday' and uses the min and max values over that period for the ylim(c(min,max)) option. Showing just the Cost example,

library(ggplot2)
library(TTR)

set.seed(1984)

# make series
start <- as.Date('2016-01-01')
end <- Sys.Date()

days <- as.numeric(end - start)

# make cost and moving averages
cost <- rnorm(days, mean = 45400, sd = 11640)
date <- seq.Date(from = start, to = end - 1, by = 'day')

cost_7 <- SMA(cost, 7)
cost_30 <- SMA(cost, 30)

df <- data.frame(Date = date, Cost = cost, Cost_7 = cost_7, Cost_30 = cost_30)

# set parameters for window
left <- end - 31
right <- end - 1

# plot series
ggplot(df, aes(x = Date, y = Cost)) +
  geom_line(lwd = 0.5) +
  geom_line(aes(y = Cost_7), col = 'red', linetype = 3, lwd = 1) +
  geom_line(aes(y = Cost_30), col = 'blue', linetype = 5, lwd = 0.75) +
  xlim(c(left, right)) +
  ylim(c(min(df$Cost[df$Date > left]), max(df$Cost[df$Date > left]))) +
  xlab("")

ggplot output

I would like to a) show thousands and millions on the y-axis with commas, and b) abbreviate those numbers with 'K' for thousands or 'MM' for millions. I realize b) may be a tall order, but for now even a) cannot be accomplished with

ggplot(...) + ... + ylim(c(min, max)) + scale_y_continuous(labels = comma)

because the following error is thrown:

## Scale for 'y' is already present. Adding another scale for 'y', which
## will replace the existing scale.

I have tried putting the scale_y_continuous(labels = comma) section after the geom_line() layer (which throws the error above) or at the end of all the ggplot layers, which overrides my limits in the ylim call and then throws the error above anyway.

Any ideas?

d8aninja asked Jun 08 '16 21:06




1 Answer

For the comma formatting, you need to load the scales package for labels = comma. The "error" you discussed is actually just an informational message, which appears because you used both ylim and then scale_y_continuous. The second call overrides the first. You can instead set the limits and specify comma-separated labels in a single call to scale_y_continuous:

library(scales)

ggplot(df, aes(x = Date, y = Cost)) +
  geom_line(lwd = 0.5) +
  geom_line(aes(y = Cost_7), col = 'red', linetype = 3, lwd = 1) +
  geom_line(aes(y = Cost_30), col = 'blue', linetype = 5, lwd = 0.75) +
  xlim(c(left, right)) +
  xlab("") +
  scale_y_continuous(labels = comma,
                     limits = c(min(df$Cost[df$Date > left]),
                                max(df$Cost[df$Date > left])))

Another option would be to melt your data to long format before plotting, which reduces the amount of code needed and streamlines aesthetic mappings:

library(reshape2)

ggplot(melt(df, id.vars = "Date"),
       aes(x = Date, y = value, color = variable, linetype = variable)) +
  geom_line() +
  xlim(c(left, right)) +
  labs(x = "", y = "Cost") +
  scale_y_continuous(labels = comma,
                     limits = c(min(df$Cost[df$Date > left]),
                                max(df$Cost[df$Date > left])))

Either way, to put the y values in terms of thousands or millions, you could divide them by 1,000 or 1,000,000. I've used dollar_format() below, but I think you'll also need to divide by the appropriate power of ten if you use unit_format (per @joran's suggestion). For example:

div <- 1000

ggplot(melt(df, id.vars = "Date"),
       aes(x = Date, y = value / div, color = variable, linetype = variable)) +
  geom_line() +
  xlim(c(left, right)) +
  labs(x = "", y = "Cost (Thousands)") +
  scale_y_continuous(labels = dollar_format(),
                     limits = c(min(df$Cost[df$Date > left]),
                                max(df$Cost[df$Date > left])) / div)

Use scale_color_manual and scale_linetype_manual to set custom colors and linetypes, if desired.
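
For the 'K' / 'MM' abbreviation asked about in b), one possibility is to leave the data untouched and do the abbreviation in a labelling function instead, since the labels argument of scale_y_continuous also accepts a function that takes the break values and returns strings. The following is only a minimal base-R sketch: abbrev_label is a made-up helper name, not something provided by scales or ggplot2, and the cut-offs and rounding are assumptions you may want to adjust.

```r
# Hypothetical helper (not part of scales or ggplot2): turns numeric axis
# breaks into short labels, e.g. 45400 -> "45K" and 2500000 -> "2.5MM".
abbrev_label <- function(x) {
  ifelse(
    abs(x) >= 1e6,
    sprintf("%.1fMM", x / 1e6),      # millions, one decimal place
    ifelse(
      abs(x) >= 1e3,
      sprintf("%.0fK", x / 1e3),     # thousands, no decimals
      sprintf("%.0f", x)             # values below 1,000 are left as-is
    )
  )
}

# Quick check on a few illustrative values
print(abbrev_label(c(500, 45400, 2500000)))
```

It could then be passed as scale_y_continuous(labels = abbrev_label, ...) together with the same limits used in the first two plots, without dividing the y values first.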

eipi10 answered Oct 08 '22 03:10