
Can I get geom_smooth() to allow line breaks when there are NA values?

I'm hoping to find a way to make line breaks show up when using geom_smooth(). Is this possible?

Here's sample data and code I'm using and the resulting plot:

library(tidyverse)  # provides tibble() and ggplot()

game_number <- 1:52

toi <- c(NA, NA, NA, NA, 20.4, 20.2, 19.4, 18.6, 17.8, 17.1, 17.7, 17.3, 16.8, 17.1, 17.8, 17.3, 16.6,
        16.9, 17.4, 16.9, 16.1, 16.6, 16.9, 16.4, NA, NA, NA, NA, NA, NA, 16.9, 18.2, 18.5, 16.6, 16.3, 15.7, 
        15.1, 14.7, 16.5, 17.9, 16.9, NA, 17.6, 18.1, 17.9, 17.2, 18.2, 18.0, 17.3, 17.8, 18.3, 17.9)

toi_df <- tibble(player = 'Nils Lundkvist', game_number = game_number, toi = toi)
plot <- ggplot(toi_df, aes(x = game_number, y = toi, group = player, colour = player)) +
  geom_line(linewidth = 0.6) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_smooth(se = FALSE, linewidth = 1) +
  scale_y_continuous(limits = c(0, 25), expand = c(0, 0))

The resulting plot looks like this. You can see the line breaks at the NA values in geom_line(), but the geom_smooth() line connects across the NA values. Is there a way to get geom_smooth() to behave like geom_line() in this scenario? Or is there some other ggplot command to use instead? Thank you!

[Image: geom_smooth() ignoring line breaks for NA values]

Zach Ellenthal asked Sep 16 '25 18:09


1 Answer

One approach is to compute the smooth yourself in a separate data frame and then merge it back into the original data, so that the fitted values stay NA wherever toi is NA and geom_line() breaks there. Here is an approach using the broom and tidyverse packages:

library(tidyverse)
library(broom)

First the data:

#Data
game_number <- c(1:52)
toi <- c(NA, NA, NA, NA, 20.4, 20.2, 19.4, 18.6, 17.8, 17.1, 17.7, 17.3, 16.8, 17.1, 17.8, 17.3, 16.6,
         16.9, 17.4, 16.9, 16.1, 16.6, 16.9, 16.4, NA, NA, NA, NA, NA, NA, 16.9, 18.2, 18.5, 16.6, 16.3, 15.7, 
         15.1, 14.7, 16.5, 17.9, 16.9, NA, 17.6, 18.1, 17.9, 17.2, 18.2, 18.0, 17.3, 17.8, 18.3, 17.9)
toi_df <- tibble(player = 'Nils Lundkvist', game_number = game_number, toi = toi)

Now, we compute the smooth model:

#Create smooth
model <- loess(toi ~ game_number, data = toi_df)
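
As background for this step, here is a minimal base R sketch of why the smooth has no gaps on its own. It assumes R's default na.action (na.omit): loess() then drops the rows where toi is NA before fitting, and because game_number itself is never missing, the model can still be evaluated at the skipped games. geom_smooth() behaves similarly, since it removes the missing rows and evaluates the fit across the whole x range. The name gap_pred is only illustrative.

```r
# Same values as in the data above; NA marks games without a value.
game_number <- 1:52
toi <- c(NA, NA, NA, NA, 20.4, 20.2, 19.4, 18.6, 17.8, 17.1, 17.7, 17.3, 16.8, 17.1, 17.8, 17.3, 16.6,
         16.9, 17.4, 16.9, 16.1, 16.6, 16.9, 16.4, NA, NA, NA, NA, NA, NA, 16.9, 18.2, 18.5, 16.6, 16.3, 15.7,
         15.1, 14.7, 16.5, 17.9, 16.9, NA, 17.6, 18.1, 17.9, 17.2, 18.2, 18.0, 17.3, 17.8, 18.3, 17.9)

# With the default na.action (na.omit), the NA rows are dropped before fitting.
model <- loess(toi ~ game_number)

# Number of observations actually used in the fit: 41 of the 52 games.
model$n

# Predicting inside the gap (games 25 to 30) still returns numbers, not NA,
# which is why the smooth runs straight through the missing stretch.
gap_pred <- predict(model, newdata = data.frame(game_number = 25:30))
gap_pred
```

So the breaks have to be restored explicitly, by leaving the fitted value as NA for every game where toi is NA.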

We create a dataframe to save the results:

#Augment model output in a new dataframe
toi_df2 <- augment(model, toi_df)

We merge the data:

#Merge data
toi_df3 <- merge(toi_df,
                 toi_df2[, c("player", "game_number", ".fitted")],
                 by = c("player", "game_number"), all.x = TRUE)
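
If you would rather not depend on broom, the augment and merge steps above can be sketched in base R. This is an alternative under stated assumptions, not the method used in this answer: it swaps augment() and merge() for a logical index (the name observed is only illustrative) and fills .fitted only for games that have a toi value.

```r
# Same data as above, built as a plain data frame so no packages are needed.
game_number <- 1:52
toi <- c(NA, NA, NA, NA, 20.4, 20.2, 19.4, 18.6, 17.8, 17.1, 17.7, 17.3, 16.8, 17.1, 17.8, 17.3, 16.6,
         16.9, 17.4, 16.9, 16.1, 16.6, 16.9, 16.4, NA, NA, NA, NA, NA, NA, 16.9, 18.2, 18.5, 16.6, 16.3, 15.7,
         15.1, 14.7, 16.5, 17.9, 16.9, NA, 17.6, 18.1, 17.9, 17.2, 18.2, 18.0, 17.3, 17.8, 18.3, 17.9)
toi_df <- data.frame(player = "Nils Lundkvist", game_number = game_number, toi = toi)

# Fit the smooth; rows with NA in toi are dropped by the default na.action.
model <- loess(toi ~ game_number, data = toi_df)

# Mark the games that have a value, start with an all-NA column,
# and fill in fitted values for the observed games only.
observed <- !is.na(toi_df$toi)
toi_df$.fitted <- NA_real_
toi_df$.fitted[observed] <- predict(model, newdata = toi_df[observed, ])
```

The .fitted column then has NA in exactly the same rows as toi, so geom_line(aes(y = .fitted)) breaks in the same places.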

Finally, we plot using geom_line():

#Plot
ggplot(toi_df3, aes(x = game_number, y = toi, group = player, colour = player)) +
  geom_line(linewidth = 0.6) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_line(aes(y = .fitted), linewidth = 1) +
  scale_y_continuous(limits = c(0, 25), expand = c(0, 0))

The approach also works if you have more than one player. In that case you can group by player (group_by() from dplyr) and use the do() function to estimate a smooth model for each player.
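
A grouped version might look like the sketch below. Two things here are my own assumptions rather than part of the approach described above: do() has been superseded since dplyr 1.0.0, so the sketch uses group_by() with mutate() instead, and smooth_keep_na() is a hypothetical helper that relies on base predict() rather than augment().

```r
library(dplyr)

game_number <- 1:52
toi <- c(NA, NA, NA, NA, 20.4, 20.2, 19.4, 18.6, 17.8, 17.1, 17.7, 17.3, 16.8, 17.1, 17.8, 17.3, 16.6,
         16.9, 17.4, 16.9, 16.1, 16.6, 16.9, 16.4, NA, NA, NA, NA, NA, NA, 16.9, 18.2, 18.5, 16.6, 16.3, 15.7,
         15.1, 14.7, 16.5, 17.9, 16.9, NA, 17.6, 18.1, 17.9, 17.2, 18.2, 18.0, 17.3, 17.8, 18.3, 17.9)

# Two players; the second is dummy data shifted up by 15.
toi_dfm <- rbind(
  data.frame(player = "Nils Lundkvist", game_number = game_number, toi = toi),
  data.frame(player = "Zach Ellenthal", game_number = game_number, toi = toi + 15)
)

# Hypothetical helper: loess fit of y on x, returned at full length,
# with NA kept wherever y is NA.
smooth_keep_na <- function(x, y) {
  observed <- !is.na(y)
  fitted <- rep(NA_real_, length(y))
  model <- loess(y ~ x)  # NA rows are dropped by the default na.action
  fitted[observed] <- predict(model, newdata = data.frame(x = x[observed]))
  fitted
}

# One smooth per player, computed within each group.
dfglobal <- toi_dfm %>%
  group_by(player) %>%
  mutate(.fitted = smooth_keep_na(game_number, toi)) %>%
  ungroup()
```

Because the helper returns a vector as long as its input, mutate() can attach it directly and no merge step is needed.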

Update:

I have added code for multiple players. In this case I created a function, myfunsmooth(), that fits loess() to one player's data and returns that data with the fitted values attached. You first use split() to get a list with one data frame per player, then apply the function to each element, bind the results, and draw the plot. Here is the code:

The dummy data:

#Data
game_number <- c(1:52)
toi <- c(NA, NA, NA, NA, 20.4, 20.2, 19.4, 18.6, 17.8, 17.1, 17.7, 17.3, 16.8, 17.1, 17.8, 17.3, 16.6,
         16.9, 17.4, 16.9, 16.1, 16.6, 16.9, 16.4, NA, NA, NA, NA, NA, NA, 16.9, 18.2, 18.5, 16.6, 16.3, 15.7, 
         15.1, 14.7, 16.5, 17.9, 16.9, NA, 17.6, 18.1, 17.9, 17.2, 18.2, 18.0, 17.3, 17.8, 18.3, 17.9)
toi_df <- tibble(player = 'Nils Lundkvist', game_number = game_number, toi = toi)
toi_df0 <- tibble(player = 'Zach Ellenthal', game_number = game_number, toi = toi)
toi_df0$toi <- toi_df0$toi+15 
toi_dfm <- rbind(toi_df,toi_df0)

The function for loess():

#Function for smoothing
myfunsmooth <- function(x) {
  # Fit the loess model for one player's data
  model <- loess(toi ~ game_number, data = x)
  # Augment model output in a new dataframe
  y <- augment(model, x)
  # Merge the fitted values back into the original rows
  z <- merge(x, y[, c("player", "game_number", ".fitted")],
             by = c("player", "game_number"), all.x = TRUE)
  # Return
  return(z)
}

Then, we create the list:

#Create list by player
List <- split(toi_dfm,toi_dfm$player)

We apply the function and bind the results in a new dataframe:

#Apply function
List2 <- lapply(List, myfunsmooth)
#Bind all
dfglobal <- do.call(rbind,List2)
rownames(dfglobal)<-NULL

Finally, we plot:

#Plot
ggplot(dfglobal, aes(x = game_number, y = toi, group = player, colour = player)) +
  geom_line(linewidth = 0.6) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_line(aes(y = .fitted), linewidth = 1) +
  scale_y_continuous(limits = c(0, 45), expand = c(0, 0))

Duck answered Sep 19 '25 10:09