I have the following problem: I would like to visualize a discrete and a continuous variable in a boxplot where the latter has a few extremely high values. This makes the boxplot meaningless (the points and even the "body" of the chart are too small), which is why I would like to show it on a log10 scale. I am aware that I could leave the extreme values out of the visualization, but I do not want to.
Let's see a simple example with diamonds data:
m <- ggplot(diamonds, aes(y = price, x = color))
The problem is not serious here, but I hope you can imagine why I would like to see the values on a log10 scale. Let's try it:
m + geom_boxplot() + coord_trans(y = "log10")
As you can see the y axis is log10 scaled and looks fine but there is a problem with the x axis, which makes the plot very strange.
The problem does not occur with scale_y_log10(), but this is not an option for me, as I cannot use a custom formatter this way. E.g.:
m + geom_boxplot() + scale_y_log10()
My question: does anyone know a way to plot the boxplot with a log10 scale on the y axis whose labels can be freely formatted with a formatter function, like in this thread?
Editing the question to help answerers based on answers and comments:
What I am really after: one log10-transformed axis (y) with non-scientific labels. I would like to label it like dollar (formatter=dollar) or any custom format.
If I try @hadley's suggestion I get the following warnings:
> m + geom_boxplot() + scale_y_log10(formatter=dollar)
Warning messages:
1: In max(x) : no non-missing arguments to max; returning -Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In max(x) : no non-missing arguments to max; returning -Inf
The y-axis labels also remain unchanged.
The simplest way is to give the 'trans' (formerly 'formatter') argument of either scale_x_continuous or scale_y_continuous the name of the desired log function:
library(ggplot2) # which formerly required pkg:plyr
m + geom_boxplot() + scale_y_continuous(trans='log10')
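Since the question asks for dollar-style labels on the log10 axis, here is a minimal sketch of how that could look in more recent ggplot2 releases. It assumes the scales package is installed and that the scale accepts a `labels` function (the role the question's `formatter` argument played); the object name `p` is only for illustration.

```r
library(ggplot2)
library(scales)  # assumed to be installed; provides dollar()

# log10-transformed y axis whose break labels are formatted as dollars
p <- ggplot(diamonds, aes(x = color, y = price)) +
  geom_boxplot() +
  scale_y_log10(labels = dollar)

p
```

Because the scale transforms the data before the boxplot statistics are computed, the boxes are drawn from the log10 values while the labels show the original prices.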
EDIT: Or if you don't like that, then either of these appears to give different but useful results:
m <- ggplot(diamonds, aes(y = price, x = color), log="y")
m + geom_boxplot()
m <- ggplot(diamonds, aes(y = price, x = color), log10="y")
m + geom_boxplot()
EDIT2 & 3: Further experiments (after discarding the one that attempted successfully to put "$" signs in front of logged values):
# Need a function that accepts an x argument
# and wraps the desired formatting around the numeric result
fmtExpLg10 <- function(x) paste(plyr::round_any(10^x/1000, 0.01), "K $", sep="")
# The formatting function is passed as the label formatter, not as a transformation
ggplot(diamonds, aes(color, log10(price))) +
  geom_boxplot() +
  scale_y_continuous("Price, log10-scaling", labels = fmtExpLg10)
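A variation on the same idea, sketched under the assumption that the scale takes a `labels` function: let the scale do the log10 transformation so the data does not have to be logged by hand, and have the formatter work on the original price values. The helper name `fmt_k_dollar` is hypothetical and uses only base R.

```r
library(ggplot2)

# Hypothetical helper: format raw prices as thousands of dollars, e.g. 2500 -> "2.50K $"
fmt_k_dollar <- function(x) sprintf("%.2fK $", x / 1000)

# The breaks reach the formatter on the original (untransformed) scale,
# so no 10^x back-transformation is needed inside the helper
p_custom <- ggplot(diamonds, aes(color, price)) +
  geom_boxplot() +
  scale_y_log10("Price, log10-scaling", labels = fmt_k_dollar)

p_custom
```

Any function that takes a numeric vector and returns a character vector of the same length could be swapped in for `fmt_k_dollar`.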
Note added mid 2017 in comment about package syntax change:
scale_y_continuous(formatter = 'log10') is now scale_y_continuous(trans = 'log10') (ggplot2 v2.2.1)