R/ggplot2: Collapse or remove segment of y-axis from scatter-plot

Tags: r, ggplot2, scatter-plot

I'm trying to make a scatter plot in R with ggplot2 in which the middle of the y-axis is collapsed or removed, because there is no data there. I mocked up the result in Photoshop (second image below), but is there a way to create a similar plot with ggplot2? This is the data on a continuous scale:

[Image: scatter plot with a continuous y-axis, https://i.stack.imgur.com/GqsPB.jpg]

But I'm trying to make something like this:

[Image: the same scatter plot with the middle of the y-axis collapsed, https://i.stack.imgur.com/mYAEC.jpg]

Here is the code:

ggplot(data=distance_data) +
    geom_point(
        aes(
            x = mdistance,
            y = maxZ,
            shape = factor(subj),
            color = factor(side),
            size = (cSA)
        )
    ) +
    scale_size_continuous(range = c(4, 10)) +
    theme(
        axis.text.x = element_text(colour = "black", size = 15),
        axis.text.y = element_text(colour = "black", size = 15),
        axis.title.x = element_text(colour = "black", size= 20, vjust = 0),
        axis.title.y = element_text(colour = "black", size= 20),
        legend.position = "none"
    ) +
    ylab("Z-score") +
    xlab("Distance")


asked Feb 19 '16 18:02

Jon

1 Answer

You could do this by defining a coordinate transformation. A standard example is a logarithmic scale, which you get in ggplot2 by using scale_y_log10().

But you can also define custom transformation functions by supplying the trans argument to scale_y_continuous() (and similarly for scale_x_continuous()). To this end, you use the function trans_new() from the scales package. It takes as arguments the transformation function and its inverse.
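As a minimal sketch of that interface, here is a forward function and its inverse passed to trans_new(). The signed cube root and the names cube_root, cube and cuberoot_trans are illustrative assumptions, not part of the question. The resulting object exposes the two functions as $transform and $inverse:

```r
library(scales)

# Illustrative forward function and its inverse: a signed cube root
cube_root <- function(x) sign(x) * abs(x)^(1/3)
cube      <- function(x) x^3

# trans_new(name, transform, inverse) bundles both into a transformation object
cuberoot_trans <- trans_new("cuberoot", cube_root, cube)

# The object stores the two functions as $transform and $inverse
cuberoot_trans$transform(c(-8, 0, 27))
cuberoot_trans$inverse(c(-2, 0, 3))
```

If I recall correctly, newer releases of scales (1.3.0) and ggplot2 (3.5.0) prefer new_transform() and a transform argument; the older names used here should still work there.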

I first discuss a solution specific to the OP's example and then show how it can be generalised.

OP's example

The OP wants to shrink the interval between -2 and 2. The following defines a function (and its inverse) that shrinks this interval by a factor of 4:

library(ggplot2)
library(scales)
trans <- function(x) {
  ifelse(x > 2, x - 1.5, ifelse(x < -2, x + 1.5, x/4))
}
inv <- function(x) {
  ifelse(x > 0.5, x + 1.5, ifelse(x < -0.5, x - 1.5, x*4))
}
my_trans <- trans_new("my_trans", trans, inv)
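To get a feel for what this mapping does before plotting, it helps to evaluate it at the axis breaks. The sketch below repeats the two functions so that it runs on its own; the variable name brk is only for illustration.

```r
trans <- function(x) {
  ifelse(x > 2, x - 1.5, ifelse(x < -2, x + 1.5, x/4))
}
inv <- function(x) {
  ifelse(x > 0.5, x + 1.5, ifelse(x < -0.5, x - 1.5, x*4))
}

brk <- seq(-6, 6, by = 2)

# positions of the breaks on the transformed axis
trans(brk)
# -4.5 -2.5 -0.5  0.0  0.5  2.5  4.5

# the interval [-2, 2] has width 4 before and width 1 after the transformation
diff(trans(c(-2, 2)))
# 1
```

Outside the squished interval the spacing between breaks is unchanged (still 2 units), so only the empty band in the middle is compressed.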

This defines the transformation. To see it in action, I define some sample data:

x_val <- 0:250
y_val <- c(-6:-2, 2:6)
set.seed(1234)
data <- data.frame(x = sample(x_val, 30, replace = TRUE),
                   y = sample(y_val, 30, replace = TRUE))

I first plot it without transformation:

p <- ggplot(data, aes(x, y)) + geom_point()
p + scale_y_continuous(breaks = seq(-6, 6, by = 2))

[Image: scatter plot of the sample data without the transformation, https://i.stack.imgur.com/nIeRR.png]

Now I use scale_y_continuous() with the transformation:

p + scale_y_continuous(trans = my_trans,
                       breaks = seq(-6, 6, by = 2))

[Image: scatter plot of the sample data with the interval between -2 and 2 squished, https://i.stack.imgur.com/Aqz23.png]

If you want another transformation, you have to change the definitions of trans() and inv() and run trans_new() again. You have to make sure that inv() is indeed the inverse of trans(). I checked this as follows:

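x <- runif(100, -100, 100)
# compare with a tolerance: exact equality can fail due to floating-point rounding
all.equal(x, trans(inv(x)))
## [1] TRUE
all.equal(x, inv(trans(x)))
## [1] TRUE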

General solution

The function below defines a transformation where you can choose the lower and upper end of the region to be squished, as well as the factor to be used. It directly returns the trans object that can be used inside scale_y_continuous:

library(scales)
squish_trans <- function(from, to, factor) {
  
  trans <- function(x) {
    
    if (any(is.na(x))) return(x)

    # get indices for the relevant regions
    isq <- x > from & x < to
    ito <- x >= to
    
    # apply transformation
    x[isq] <- from + (x[isq] - from)/factor
    x[ito] <- from + (to - from)/factor + (x[ito] - to)
    
    return(x)
  }

  inv <- function(x) {
    
    if (any(is.na(x))) return(x)

    # get indices for the relevant regions
    isq <- x > from & x < from + (to - from)/factor
    ito <- x >= from + (to - from)/factor
    
    # apply transformation
    x[isq] <- from + (x[isq] - from) * factor
    x[ito] <- to + (x[ito] - (from + (to - from)/factor))
    
    return(x)
  }
  
  # return the transformation
  return(trans_new("squished", trans, inv))
}

The first line in trans() and inv() handles the case where the transformation is called with x = c(NA, NA). (This apparently did not happen with the version of ggplot2 that was current when I originally wrote this answer. Unfortunately, I don't know in which version it started.)
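The guard matters because a logical index that contains NA cannot be used in a subscripted assignment with a multi-element replacement. The sketch below reproduces that with a standalone copy of the forward mapping (the helper names squish_no_guard and squish_na_safe are hypothetical). It also shows one possible alternative, wrapping the conditions in which(), that transforms the non-missing values instead of returning the whole vector untouched. This is a variation under my own assumptions, not the code from the answer.

```r
# Forward mapping without the NA guard (same formulas as trans() above)
squish_no_guard <- function(x, from, to, factor) {
  isq <- x > from & x < to
  ito <- x >= to
  x[isq] <- from + (x[isq] - from)/factor
  x[ito] <- from + (to - from)/factor + (x[ito] - to)
  x
}

# With x = c(NA, NA) the logical indices are NA, and the assignment fails;
# tryCatch() captures the error message instead of stopping the script
res <- tryCatch(squish_no_guard(c(NA_real_, NA_real_), -2, 2, 4),
                error = function(e) conditionMessage(e))
res

# Alternative: which() drops NA from the index, so missing values pass through
# and the remaining values are still transformed
squish_na_safe <- function(x, from, to, factor) {
  isq <- which(x > from & x < to)
  ito <- which(x >= to)
  x[isq] <- from + (x[isq] - from)/factor
  x[ito] <- from + (to - from)/factor + (x[ito] - to)
  x
}

squish_na_safe(c(-6, NA, 0, 6), -2, 2, 4)
# -6.0   NA -1.5  3.0
```

Note that the guard in the answer returns its input unchanged as soon as any element is NA, so a vector that mixes real values with NA would not be transformed at all; the which() variant avoids that.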

This function can now be used to conveniently redo the plot from the first section:

p + scale_y_continuous(trans = squish_trans(-2, 2, 4),
                       breaks = seq(-6, 6, by = 2))
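That the general function reproduces the earlier plot is not obvious from the formulas, because squish_trans() leaves everything below from unchanged while my_trans is symmetric around zero. A quick check with a standalone copy of the forward mapping (the helper name squish_fwd is hypothetical) suggests that the two differ only by a constant shift, which does not change how the axis is drawn:

```r
# forward mapping from the first section
trans <- function(x) {
  ifelse(x > 2, x - 1.5, ifelse(x < -2, x + 1.5, x/4))
}

# forward mapping of the general solution, written as a plain function
squish_fwd <- function(x, from, to, factor) {
  isq <- x > from & x < to
  ito <- x >= to
  x[isq] <- from + (x[isq] - from)/factor
  x[ito] <- from + (to - from)/factor + (x[ito] - to)
  x
}

y <- seq(-6, 6, by = 0.5)

# the difference between the two mappings is the same for every value
unique(trans(y) - squish_fwd(y, -2, 2, 4))
# 1.5
```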

The following example shows that you can squish the scale at an arbitrary position and that this also works for geoms other than points:

df <- data.frame(class = LETTERS[1:4],
                 val = c(1, 2, 101, 102))
ggplot(df, aes(x = class, y = val)) + geom_bar(stat = "identity") +
  scale_y_continuous(trans = squish_trans(3, 100, 50),
                     breaks = c(0, 1, 2, 3, 50, 100, 101, 102))

[Image: bar chart with the y-axis squished between 3 and 100, https://i.stack.imgur.com/KODKj.png]

Let me close by stressing what others have already mentioned in the comments: this kind of plot can be misleading and should be used with care!


answered Oct 25 '22 18:10

Stibu
