I have a dataset with some outliers, such as the following
x <- rnorm(1000,0,20)
x <- c(x, 500, -500)
If we plot this on a linear x axis scale at this we see
histogram(x)
I worked out a nice way to put it on a log scale using this useful thread: how to use a log scale for y-axis of histogram in R? :
mat <- data.frame(x)
ggplot(ee, aes(x = xx)) + geom_histogram(colour="darkblue", size=1, fill="blue") + scale_x_log10()
However, I would like the x axis labels from this 2nd example to match that of the first example, except with a kind of "negative log" - i.e. first tick (moving from the centre to the left) could be -1, then the next could be -10, the next -100, but all equidistant. Does that make sense?
Bottom line: A logarithmic axis can only plot positive values. There simply is no way to put negative values or zero on a logarithmic axis.
A logarithmic axis compresses the range in a non-linear fashion, which means that variable width bins have to be used for histograms and the y-axis represents density (not a count). Taking logs and using the result to plot a histogram usually produces a curve having a distorted shape, not twin peaks.
There are two main reasons to use logarithmic scales in charts and graphs. The first is to respond to skewness towards large values; i.e., cases in which one or a few points are much larger than the bulk of the data. The second is to show percent change or multiplicative factors.
I am not sure I understand your goal, but when you want a log-like transformation yet have zeroes or negative values, the inverse hyperbolic sine transformation asinh()
is often a good option. It is log-like for large values and is defined for all real values. See Rob Hyndman's blog and this question on stats.stackexchange.com for discussion, details, and other options.
If this is an acceptable approach, you can create a custom scale for ggplot. The code below demonstrates how to create and use a custom scale (with custom breaks), along with a visualization of the asinh() transformation.
library(ggplot2)
library(scales)
limits <- 100
step <- 0.005
demo <- data.frame(x=seq(from=-1*limits,to=limits,by=step))
asinh_trans <- function(){
trans_new(name = 'asinh', transform = function(x) asinh(x),
inverse = function(x) sinh(x))
}
ggplot(demo,aes(x,x))+geom_point(size=2)+
scale_y_continuous(trans = 'asinh',breaks=c(-100,-50,-10,-1,0,1,10,50,100))+
theme_bw()
ggplot(demo,aes(x,x))+geom_point(size=2)+
scale_x_continuous(trans = 'asinh',breaks=c(0,1,10,50,100))+
scale_y_log10(breaks=c(0,1,10,50,100))+ # zero won't plot
xlab("asinh() scale")+ylab("log10 scale")+
theme_bw()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With