I'm hoping to create a plot that shows a running average over a scatterplot of the observed data. The data consists of observations of hares' coat color (Color) over time (Julian).
Color Julian
50 85
50 87
50 89
50 90
100 91
50 91
50 92
50 92
100 92
50 93
100 93
50 93
50 95
100 95
50 95
50 96
50 96
50 99
50 100
0 101
0 101
0 103
50 103
50 104
50 104
50 104
50 104
100 104
100 104
50 109
50 109
100 109
0 110
0 110
50 110
50 110
50 110
50 110
0 112
A friend wrote a function for me that calculates a running average of the color observations, but I can't figure out how to add the line (haresAveNoNa) into the plot.
The function:
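haresAverage <- matrix(NA, max(hares$Julian), 3)
for (i in 4:max(hares$Julian)) {
  # Column 1: Julian day
  haresAverage[i, 1] <- i
  # Column 2: mean Color within 3 days either side of day i
  haresAverage[i, 2] <- mean(hares$Color[hares$Julian >= (i - 3) &
                                         hares$Julian <= (i + 3)],
                             na.rm = TRUE)
  # Column 3: standard deviation of Color over the same window
  haresAverage[i, 3] <- sd(hares$Color[hares$Julian >= (i - 3) &
                                       hares$Julian <= (i + 3)],
                           na.rm = TRUE)
}
# Drop rows where the mean or standard deviation could not be computed
haresAveNoNa <- na.omit(haresAverage)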
The plot:
p <- ggplot(hares, aes(Julian, Color))
p +
  geom_jitter(width = 1, height = 5, color = "blue", alpha = .65)
Can you please help me add the running average 'haresAveNoNa' into the plot? Thanks very much!
You can calculate the rolling mean using rollmean from the zoo package instead of writing your own function. You can invoke rollmean on the fly, within ggplot, to add the rolling mean line, or you can add the rolling mean values to your data frame and then plot them. I provide examples below for both methods. The code below calculates a centered rolling mean over a window of seven observations. Note that rollmean counts rows, not days, so this matches a seven-day window only when there is exactly one observation per day and the rows are sorted by Julian. You can customize the function for different window sizes and for a left- or right-aligned rolling mean, rather than centered.
Calculate the rolling mean on the fly within ggplot
library(ggplot2)
library(zoo)
# fill=NA pads the ends of the series (replaces the deprecated na.pad=TRUE)
ggplot(hares, aes(Julian, Color)) +
  geom_point(position=position_jitter(1,3), pch=21, fill="#FF0000AA") +
  geom_line(aes(y=rollmean(Color, 7, fill=NA))) +
  theme_bw()
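As a rough illustration of the window-size and alignment options mentioned above, here is a minimal sketch on a made-up vector (the values in x are toy numbers, not the hare data). It assumes zoo is installed and uses rollmean's k, fill and align arguments.

```r
library(zoo)

# Toy values for illustration only (not the hare observations)
x <- c(50, 50, 100, 50, 0, 0, 50, 100, 50)

# Centered window of 3 observations; the ends are padded with NA
centered <- rollmean(x, k = 3, fill = NA, align = "center")

# Right-aligned: each value averages the current and the two preceding observations
trailing <- rollmean(x, k = 3, fill = NA, align = "right")

# Left-aligned: each value averages the current and the two following observations
leading <- rollmean(x, k = 3, fill = NA, align = "left")

print(cbind(x, centered, trailing, leading))
```

A right-aligned window is the usual choice when each point should only depend on past observations; the centered version is smoother to read on a plot but uses observations from both sides.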
To answer your specific question, let's say you actually do need to add the rolling mean line from separate data, rather than calculate it on the fly. If the rolling mean is another column in your data frame, you just need to give the new column name to geom_line:
# fill=NA pads the ends of the series (replaces the deprecated na.pad=TRUE)
hares$roll7 = rollmean(hares$Color, 7, fill=NA)
ggplot(hares, aes(Julian, Color)) +
  geom_point(position=position_jitter(1,3), pch=21, fill="#FF0000AA") +
  geom_line(aes(y=roll7)) +
  theme_bw()
If the rolling mean is in a separate data frame, you need to feed that data frame to geom_line:
# fill=NA pads the ends of the series (replaces the deprecated na.pad=TRUE)
haresAverage = data.frame(Julian=hares$Julian,
                          Color=rollmean(hares$Color, 7, fill=NA))
ggplot(hares, aes(Julian, Color)) +
  geom_point(position=position_jitter(1,3), pch=21, fill="#FF0000AA") +
  geom_line(data=haresAverage, aes(Julian, Color)) +
  theme_bw()
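The same approach should also work with the haresAveNoNa object from the question. It is a matrix rather than a data frame, so it has to be converted first. The sketch below rebuilds the posted data and the loop so that it runs on its own. The names haresAveDf, RollMean and RollSD are made up for this example.

```r
library(ggplot2)

# The observations posted in the question
hares <- data.frame(
  Color  = c(50, 50, 50, 50, 100, 50, 50, 50, 100, 50, 100, 50, 50, 100, 50,
             50, 50, 50, 50, 0, 0, 0, 50, 50, 50, 50, 50, 100, 100, 50, 50,
             100, 0, 0, 50, 50, 50, 50, 0),
  Julian = c(85, 87, 89, 90, 91, 91, 92, 92, 92, 93, 93, 93, 95, 95, 95,
             96, 96, 99, 100, 101, 101, 103, 103, 104, 104, 104, 104, 104,
             104, 109, 109, 109, 110, 110, 110, 110, 110, 110, 112)
)

# The running average from the question: mean and SD within 3 days of day i
haresAverage <- matrix(NA, max(hares$Julian), 3)
for (i in 4:max(hares$Julian)) {
  inWindow <- hares$Julian >= (i - 3) & hares$Julian <= (i + 3)
  haresAverage[i, 1] <- i
  haresAverage[i, 2] <- mean(hares$Color[inWindow], na.rm = TRUE)
  haresAverage[i, 3] <- sd(hares$Color[inWindow], na.rm = TRUE)
}
haresAveNoNa <- na.omit(haresAverage)

# ggplot needs a data frame with named columns (names are hypothetical)
haresAveDf <- as.data.frame(haresAveNoNa)
names(haresAveDf) <- c("Julian", "RollMean", "RollSD")

# Original jittered scatterplot plus the running-average line
p <- ggplot(hares, aes(Julian, Color)) +
  geom_jitter(width = 1, height = 5, color = "blue", alpha = .65) +
  geom_line(data = haresAveDf, aes(Julian, RollMean))
print(p)
```

Unlike rollmean, this loop averages over calendar days (day i plus or minus 3), so days with several observations and days with none are handled by date rather than by row position.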
Show dates on the x-axis instead of the Julian value
First, convert Julian to Date format. I don't know the actual mapping from Julian to date in your data, so for this example let's assume that Julian is the day of the year, counting the first day of the year as 1, and let's assume the year is 2015.
# Day 1 maps to 2015-01-01; passing origin explicitly also works in R versions before 4.3
hares$Date = as.Date(hares$Julian - 1, origin="2015-01-01")
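To sanity-check the assumed mapping, one option is to convert a few day numbers and read the day of year back with format. This is a small sketch under the same assumptions (day 1 is January 1, year 2015). The vector julian_day is made up for the check, and the conversion passes origin explicitly.

```r
# A few day-of-year values to check (85 and 112 are the first and last in the data)
julian_day <- c(1, 85, 112)

# Day 1 should land on the origin date itself, hence the "- 1"
dates <- as.Date(julian_day - 1, origin = "2015-01-01")
print(dates)

# "%j" formats a Date as its day of the year; it should match the input
day_of_year <- as.integer(format(dates, "%j"))
print(day_of_year)
```

If the real data come from a different year, or count from 0 rather than 1, only the origin or the offset needs to change.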
Now we plot using our new Date column for the x-axis. To customize both the number of breaks and the date labels, use scale_x_date.
# fill=NA pads the ends of the series (replaces the deprecated na.pad=TRUE)
ggplot(hares, aes(Date, Color)) +
  geom_point(position=position_jitter(1,3), pch=21, fill="#FF0000AA") +
  geom_line(aes(y=rollmean(Color, 7, fill=NA))) +
  theme_bw() +
  scale_x_date(date_breaks="weeks", date_labels="%b %e")