Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

7 Day Moving Average per group - R

I have daily data with multiple categorical values, stored as a data frame:

YYYYMM    Date         ID    Count
201401    01/01/2014   A     151
201401    01/01/2014   B     68
201401    01/01/2014   C     487
201401    02/01/2014   A     198
201401    02/01/2014   B     97
201401    02/01/2014   C     403

I am trying to plot a moving average against the actual values, using ggplot.

What I would like to do is create a 5th column in my data frame which has the mean in it.

I have tried this solution (found here: Constructing moving average over a categorical variable in R)

df$Mean<-0
library(plyr)
library(zoo)
ddply(
      df, "ID",
       transform,
        df$Mean<-rollmean(Count, 7, na.pad=TRUE)
     )

and it works, but it calculates the mean for every column in my data frame, and makes another data frame within the existing one, so I end up with something like this:

YYYYMM  Date        ID  Count  Mean.YYYYMM  Mean.Date   Mean.ID  Mean.Count
201401  01/01/2014  A   151    201401       01/01/2014  B        58.90
201401  01/01/2014  B   68     201401       01/01/2014  B        62.05
201401  01/01/2014  C   487    201401       01/01/2014  B        61.84
201401  02/01/2014  A   198    201401       01/01/2014  B        58.02
201401  02/01/2014  B   97     201401       01/01/2014  B        57.65
201401  02/01/2014  C   403    201401       01/01/2014  B        59.65

When I try and plot this

for (var in unique(df$ID))
{
ggplot(df[df$ID==var,], aes(x=Date)) +
        geom_line(aes(y=Count),color="blue") +
        geom_line(aes(y=Mean$Count),color="grey",linetype="dashed") +
        facet_wrap(~ID) +
        theme_bw()
}

I get an error message. I'm wondering what I'm missing here, or if there is another way to go about this?

like image 399
Adam Khan Avatar asked May 26 '17 14:05

Adam Khan


People also ask

How do you do a 7 day moving average in R?

The 7-day moving average for any given day is calculated by taking the average sales of that day and the 6 previous days. The 14-day moving average is calculated by taking the average sales of the day in question and the previous 13 days. In R we can calculate this with a function called rollmean from the zoo package.

What was the 7 day centered moving average?

A 7-day moving average (MA) is a short term trend indicator. It is quite simply the average of closing prices of the last seven trading days.

How does R calculate SMA?

SMA or simple moving average is an arithmetic moving average calculated by adding the recent prices and then dividing that value by the number of time periods in the calculation average.


1 Answers

You didn't supply enough data to create a weekly rolling mean within the groups, but in principle it could work like this:

library(tidyverse)
library(zoo)

my_data <- my_data %>% 
  group_by(ID) %>% 
  mutate(roll_mean = rollmean(Count, 2, na.pad = T))

Using dplyr you group_by your ID variable, and then create a single new column with the rolling mean. You can plot this then with standard ggplot2-syntax:

ggplot(my_data, aes(Date, Count, group = 1)) +
  geom_line(colour = "blue") +
  geom_point(colour = "blue") +
  geom_point(aes(y = roll_mean), colour = "red") +
  facet_wrap(~ID)
#> Warning: Removed 3 rows containing missing values (geom_point).

Data

zzz <- "YYYYMM    Date         ID    Count
201401    01/01/2014   A     151
201401    01/01/2014   B     68
201401    01/01/2014   C     487
201401    02/01/2014   A     198
201401    02/01/2014   B     97
201401    02/01/2014   C     403"

my_data <- read_table(zzz)
like image 162
Thomas K Avatar answered Oct 06 '22 00:10

Thomas K