How can I add lines to connect points on regression line to both x and y axis on ggplot?

Tags:

ggplot2

How can I add lines that connect specific points on a fitted regression curve to their positions on the x axis and to the corresponding values on the y axis?

Here is a reproducible example:

library(ggplot2)
library(ggpmisc)

x<-c(1,2,3,5,10,12,15,20,22,25,30,33,37)

y<-c(1000,800,100,10,1,0.3,0.25,0.2,0.1,0.1,0.03,0.05,0.03)

myformula<-y ~ poly(x,3)

df <- data.frame(x, y)

ggplot(df, aes(x,y)) + 
  geom_point() + 
  stat_smooth(method = lm, formula = myformula) +
  stat_poly_eq(formula = myformula, eq.with.lhs = "italic(psi)~`=`~",
               eq.x.rhs = "~italic(theta)", 
               aes(label = paste(..eq.label.., ..rr.label.., 
                                 sep = "~~~~")), label.x=0.15, parse = TRUE)+ 
  xlim(0, 40)+  
  # ylim() is left out: scale_y_log10() would replace it anyway,
  # and a break at 0 cannot be shown on a log scale
  scale_y_log10(breaks = c(0.1,10,1000), labels= c(0.1, 10,1000))

This is what I have: enter image description here

And this is what I would like: enter image description here


asked Feb 11 '21 04:02

2 Answers

You'll first want to save your plot for later use. Here I'm saving it into the object p (and ignoring the parts that aren't related to your question).

p <- ggplot(df, aes(x,y)) + 
  geom_point() + 
  stat_smooth(method = lm, formula = myformula) +  # added second, so it is the 2nd layer
  xlim(0, 40) +  
  scale_y_log10(breaks = c(0.1,10,1000), labels= c(0.1, 10,1000))

The ggplot2 package has a function, ggplot_build(), which exposes the computed data behind every layer of the plot.

plot_str <- ggplot_build(p)

The object created is a list with a data element, which is itself a list holding one data frame per layer used to build the plot. Here we are interested in the fitted line, which is the 2nd data frame in that list. Note that, because of scale_y_log10(), its y values are on the log10 scale.

head(plot_str$data[[2]])
         x        y     ymin     ymax        se flipped_aes PANEL group  colour   fill size linetype weight alpha
1 1.000000 3.019354 2.645929 3.392780 0.1650750       FALSE     1    -1 #3366FF grey60    1        1      1   0.4
2 1.455696 2.796358 2.458116 3.134599 0.1495218       FALSE     1    -1 #3366FF grey60    1        1      1   0.4
3 1.911392 2.581749 2.273151 2.890348 0.1364177       FALSE     1    -1 #3366FF grey60    1        1      1   0.4
4 2.367089 2.375366 2.090702 2.660030 0.1258375       FALSE     1    -1 #3366FF grey60    1        1      1   0.4
5 2.822785 2.177044 1.910571 2.443518 0.1177963       FALSE     1    -1 #3366FF grey60    1        1      1   0.4
6 3.278481 1.986621 1.732776 2.240466 0.1122137       FALSE     1    -1 #3366FF grey60    1        1      1   0.4

Now we can just grab a couple of points. Here I'm grabbing the 5th and the 70th row.

specific_points <- plot_str$data[[2]][c(5, 70), ]
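If you want the guide lines at particular x values rather than at whichever rows happen to sit on the 80-point grid, one option is to interpolate along the built smooth data. The following is only a sketch and not part of the original answer: it rebuilds the plot so that it is self-contained, picks the smooth layer by its stat class instead of by its position in the list, and uses approx() for linear interpolation. The names smooth_idx, smooth_data, target_x and picked are made up for illustration, and the x values 4 and 34 are arbitrary choices.

```r
library(ggplot2)

x <- c(1, 2, 3, 5, 10, 12, 15, 20, 22, 25, 30, 33, 37)
y <- c(1000, 800, 100, 10, 1, 0.3, 0.25, 0.2, 0.1, 0.1, 0.03, 0.05, 0.03)
df <- data.frame(x, y)

p <- ggplot(df, aes(x, y)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ poly(x, 3)) +
  xlim(0, 40) +
  scale_y_log10()

# Find the smooth layer by its stat rather than by its position in the list
smooth_idx <- which(vapply(p$layers,
                           function(l) inherits(l$stat, "StatSmooth"),
                           logical(1)))[1]
smooth_data <- ggplot_build(p)$data[[smooth_idx]]

# Linearly interpolate the fitted curve at the x values of interest
# (hypothetical choices; y comes back on the log10 scale)
target_x <- c(4, 34)
picked <- as.data.frame(approx(smooth_data$x, smooth_data$y, xout = target_x))
```

picked$x and picked$y can then stand in for specific_points$x and specific_points$y in the geom_segment() calls.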

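Then, going back to the saved plot, I add a few segment geoms that reference those points. Because x, y, xend and yend are set outside aes(), they are not passed through the scale, so the log10-scale values line up with the curve, and -Inf pins the end of a segment to the edge of the panel.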

p + 
  geom_segment(y = specific_points$y[1], yend = specific_points$y[1], x = -Inf, xend = specific_points$x[1]) + 
  geom_segment(y = specific_points$y[1], yend = -Inf, x = specific_points$x[1], xend = specific_points$x[1], linetype = "dashed") + 
  geom_segment(y = specific_points$y[2], yend = specific_points$y[2], x = -Inf, xend = specific_points$x[2]) + 
  geom_segment(y = specific_points$y[2], yend = -Inf, x = specific_points$x[2], xend = specific_points$x[2], linetype = "dashed")



answered Nov 11 '22 06:11

Phil

(My answer doesn't quite work yet, because I haven't figured out how to replicate the scale transformation ggplot2 does before it fits the poly line. I'd love to hear any suggestions!)

My approach was to fit the curve outside of ggplot, and use those results to drive the annotations. Here's a table showing the coordinates of two points on the fitted poly line:

points <- c(4, 34)
lines <- data.frame(
  x = points,
  y = predict(lm(myformula), data.frame(x = points))
)
#lines
#   x       y
#1  4 370.537
#2 34  41.233

Then we can feed those into geom_segment:

ggplot(df, aes(x,y)) + 
  stat_smooth(method = lm, formula = myformula) + 
  geom_point() + 
  stat_poly_eq(formula = myformula, eq.with.lhs = "italic(psi)~`=`~",
               eq.x.rhs = "~italic(theta)", 
               aes(label = paste(..eq.label.., ..rr.label.., 
                                 sep = "~~~~")), label.x=0.15, parse = TRUE)+ 
  geom_segment(data = lines, lty = "dashed",
               aes(x = x, xend = x, y = 0, yend = y)) +
  geom_segment(data = lines, lty = "dashed",
               aes(x = 0, xend = x, y = y, yend = y))


Unfortunately, this doesn't work if y is transformed, as in the original question. I learned that ggplot2 applies the scale transformation before it fits the model, so with scale_y_log10() the smoother is fit to log10(y), and that fit differs from the one on the untransformed data.
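The difference between the two fits can be checked outside ggplot2. This is a small sketch under the assumption that fitting log10(y) with lm() mirrors what stat_smooth() sees once the log scale is in place; fit_raw, fit_log, r2_raw and r2_log are names introduced here for illustration.

```r
x <- c(1, 2, 3, 5, 10, 12, 15, 20, 22, 25, 30, 33, 37)
y <- c(1000, 800, 100, 10, 1, 0.3, 0.25, 0.2, 0.1, 0.1, 0.03, 0.05, 0.03)

# Cubic fit on the raw response, as in the plot without a log scale
fit_raw <- lm(y ~ poly(x, 3))

# Cubic fit on log10(y); assumed to match the model fitted under scale_y_log10()
fit_log <- lm(log10(y) ~ poly(x, 3))

r2_raw <- summary(fit_raw)$r.squared
r2_log <- summary(fit_log)$r.squared

# Side-by-side comparison of the two R-squared values
c(raw = r2_raw, log10 = r2_log)
```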

Here we see that, after adding scale_y_log10(breaks = c(0, 0.1,10,1000), labels= c(0,0.1, 10,1000)), the transformed data can be fit more closely by a cubic curve than the untransformed data, with R^2 increasing to 0.98. However, the old line segments no longer work, since the fit has changed. I will leave the correction of that fit to the reader since I can't figure it out.

enter image description here
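One possible way to close that gap, offered as a sketch and not as a tested fix from the original answer, is to fit the model to log10(y) yourself and back-transform the predictions before handing them to geom_segment(). It assumes that lm(log10(y) ~ poly(x, 3)) reproduces the model stat_smooth() fits under scale_y_log10(); fit_log and p_log are illustrative names, and stat_poly_eq() is left out to keep the example short.

```r
library(ggplot2)

x <- c(1, 2, 3, 5, 10, 12, 15, 20, 22, 25, 30, 33, 37)
y <- c(1000, 800, 100, 10, 1, 0.3, 0.25, 0.2, 0.1, 0.1, 0.03, 0.05, 0.03)
df <- data.frame(x, y)

# Assumption: this is the same model stat_smooth() fits after the log10 scale
fit_log <- lm(log10(y) ~ poly(x, 3), data = df)

points <- c(4, 34)
lines <- data.frame(x = points)
lines$y <- 10^predict(fit_log, newdata = lines)  # back to original y units

p_log <- ggplot(df, aes(x, y)) +
  stat_smooth(method = "lm", formula = y ~ poly(x, 3)) +
  geom_point() +
  # yend is mapped, so the scale log-transforms it; y = -Inf is a fixed
  # parameter, is not transformed, and pins the segment to the bottom edge
  geom_segment(aes(x = x, xend = x, yend = y), data = lines,
               y = -Inf, linetype = "dashed", inherit.aes = FALSE) +
  # same idea for the horizontal segment, pinned to the left edge
  geom_segment(aes(xend = x, y = y, yend = y), data = lines,
               x = -Inf, linetype = "dashed", inherit.aes = FALSE) +
  scale_y_log10()

p_log
```

The segments start at -Inf instead of 0 because 0 has no finite position on a log10 axis.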

answered Nov 11 '22 06:11

Jon Spring
