
Rolling mean (moving average) by group/id with dplyr




I have a longitudinal follow-up of blood pressure recordings.

A single reading is less predictive than the moving average (rolling mean), so I'd like to calculate the moving average. The data look like this:

test <- read.table(header = TRUE, text = "
  ID  AGE  YEAR_VISIT  BLOOD_PRESSURE  TREATMENT
  1   20   2000        NA              3
  1   21   2001        129             2
  1   22   2002        145             3
  1   22   2002        130             2
  2   23   2003        NA              NA
  2   30   2010        150             2
  2   31   2011        110             3
  4   50   2005        140             3
  4   50   2005        130             3
  4   50   2005        NA              3
  4   51   2006        312             2
  5   27   2010        140             4
  5   28   2011        170             4
  5   29   2012        160             NA
  7   40   2007        120             NA
")

I'd like to calculate a new variable, called BLOOD_PRESSURE_UPDATED. This variable should be the moving average for BLOOD_PRESSURE and have the following characteristics:

  • The moving average is the mean of the current value and the previous value, i.e. (current + previous) / 2.
  • For the first observation, BLOOD_PRESSURE_UPDATED is just the current BLOOD_PRESSURE. If that is missing, BLOOD_PRESSURE_UPDATED should be the overall mean.
  • Missing values should be filled in with the nearest previous value.
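The three rules above can be sketched roughly as follows. This is only one possible reading of them, not a definitive implementation. The assumptions are: "overall mean" means the mean of all non-missing readings across every ID; filling happens within each ID before averaging; a first reading replaced by the overall mean also feeds into the next row's average. `BP_FILLED` and `overall_mean` are helper names introduced here for illustration.

```r
library(dplyr)
library(zoo)

test <- read.table(header = TRUE, text = "
ID AGE YEAR_VISIT BLOOD_PRESSURE TREATMENT
1 20 2000 NA 3
1 21 2001 129 2
1 22 2002 145 3
1 22 2002 130 2
2 23 2003 NA NA
2 30 2010 150 2
2 31 2011 110 3
4 50 2005 140 3
4 50 2005 130 3
4 50 2005 NA 3
4 51 2006 312 2
5 27 2010 140 4
5 28 2011 170 4
5 29 2012 160 NA
7 40 2007 120 NA
")

# Assumption: the "overall mean" is taken over all non-missing readings, all IDs
overall_mean <- mean(test$BLOOD_PRESSURE, na.rm = TRUE)

test2 <- test %>%
  arrange(ID, YEAR_VISIT) %>%
  group_by(ID) %>%
  mutate(
    # Rule 3: carry the nearest previous reading forward within each ID
    BP_FILLED = zoo::na.locf(as.numeric(BLOOD_PRESSURE), na.rm = FALSE),
    # Rule 2: a reading still missing at the start of an ID gets the overall mean
    BP_FILLED = dplyr::coalesce(BP_FILLED, overall_mean),
    # Rule 1: mean of current and previous value; the first row of each ID
    # uses itself as its "previous" value, so it stays unchanged
    BLOOD_PRESSURE_UPDATED =
      (BP_FILLED + dplyr::lag(BP_FILLED, default = dplyr::first(BP_FILLED))) / 2
  ) %>%
  ungroup()
```

Under these assumptions, the readings 140, 130, NA, 312 for ID 4 become 140, 135, 130, 221.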

I've tried the following:

test2 <- test %>%
  group_by(ID) %>%
  arrange(ID, YEAR_VISIT) %>%
  mutate(BLOOD_PRESSURE_UPDATED = rollmean(x = BLOOD_PRESSURE, 2)) %>%
  ungroup()

I have also tried rollapply and rollmeanr, without success.
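One likely reason the attempt fails (an assumption, since no error message is given) is that zoo's rolling functions return fewer values than they receive unless padding is requested, while mutate() expects one value per row. A minimal sketch on a plain vector, with the variable names `short` and `padded` chosen here for illustration:

```r
library(zoo)

x <- c(129, 145, 130)

# Without `fill`, a width-2 rolling mean has one value fewer than the input
short <- rollmean(x, 2)

# Right-aligned and padded with NA, the result has the same length as x
padded <- rollmeanr(x, 2, fill = NA)

length(short)   # 2
length(padded)  # 3
```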

Adam Robinsson asked Oct 05 '14 00:10

2 Answers

How about this?

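    library(dplyr)

    # lag1 = previous row's reading, lag2 = the reading before that
    # movave = mean of those two earlier readings (rows are not grouped by ID here)
    test2 <- arrange(test, ID, YEAR_VISIT) %>%
      mutate(lag1 = lag(BLOOD_PRESSURE),
             lag2 = lag(BLOOD_PRESSURE, 2),
             movave = (lag1 + lag2) / 2)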

Another solution uses the rollapply function from the zoo package (I prefer this one):

library(dplyr)
library(zoo)

test2 <- arrange(test, ID, YEAR_VISIT) %>%
  mutate(ma2 = rollapply(BLOOD_PRESSURE, 2, mean, align = 'right', fill = NA))
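Since the question asks for the average per ID, a grouped variant may be closer to what is wanted. The sketch below uses a small hypothetical data set (`bp`, two IDs, no missing readings) and assumes that the first row of each ID should keep its own value, which `partial = TRUE` provides by averaging over whatever part of the window exists.

```r
library(dplyr)
library(zoo)

# Hypothetical mini data set, made up for illustration only
bp <- data.frame(
  ID = c(1, 1, 1, 2, 2, 2),
  YEAR_VISIT = c(2001, 2002, 2003, 2010, 2011, 2012),
  BLOOD_PRESSURE = c(129, 145, 130, 150, 110, 140)
)

bp_ma <- bp %>%
  arrange(ID, YEAR_VISIT) %>%
  group_by(ID) %>%
  # Right-aligned width-2 window; partial = TRUE lets the first row of each ID
  # be averaged over itself alone instead of being dropped or padded with NA
  mutate(ma2 = rollapplyr(BLOOD_PRESSURE, 2, mean, partial = TRUE)) %>%
  ungroup()
```

Because the window restarts within each group, the first reading of ID 2 is not averaged with the last reading of ID 1.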
hyunwoo jeong answered Sep 20 '22 09:09

slider is a newer alternative that plays nicely with the tidyverse.

Something like this would do the trick:

test2 <- test %>%
  group_by(ID) %>%
  arrange(ID, YEAR_VISIT) %>%
  mutate(BLOOD_PRESSURE_UPDATED = slider::slide_dbl(BLOOD_PRESSURE, mean, .before = 1, .after = 0)) %>%
  ungroup()
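It may help to see how missing readings behave with this approach. The sketch below runs `slide_dbl` on the readings of ID 4 from the question; the variable names `plain` and `skip_na` are introduced here for illustration. Extra arguments such as `na.rm = TRUE` are passed through to `mean`.

```r
library(slider)

x <- c(140, 130, NA, 312)

# Any window containing NA yields NA
plain <- slide_dbl(x, mean, .before = 1)

# With na.rm = TRUE, each window is averaged over its non-missing values only
skip_na <- slide_dbl(x, mean, na.rm = TRUE, .before = 1)
```

Note that `na.rm = TRUE` is not the same as filling with the nearest previous value: the last window here averages to 312, whereas carrying 130 forward first would give 221.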
elikesprogramming answered Sep 23 '22 09:09
