Calculating cumulative standard deviation by group using R

Tags: r

I am pretty new to R and want to calculate the cumulative standard deviation by group. I have a data frame D with a visitor ID and the time on page (top) that the visitor spent on each page, as below:

ID   top
v1   2.3  
v1   4.8
v1   10.2
v2   16.2
v2   12.2
v2   14.3
v2   12.4
v3   8.2
v3   8.8

The output needs to look like this

ID   top  cum_sd
v1   2.3  
v1   4.8   1.76
v1   10.2  4.03
v2   16.2
v2   12.2  2.82
v2   14.3  2.00
v2   12.4  1.87
v3   8.2   
v3   8.8   0.42

Thank you for the help in advance.

rgo asked Sep 11 '25 09:09


2 Answers

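We can use runSD from TTR. Convert the 'data.frame' to a 'data.table' with setDT(df1), then, grouped by 'ID', apply runSD with cumulative=TRUE to the 'top' column and assign (:=) the rounded result to a new column 'cum_sd'. The first row of each group is NA because the standard deviation of a single value is undefined.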

library(data.table)
library(TTR)
# convert df1 to a data.table by reference, then add cum_sd within each ID
setDT(df1)[, cum_sd := round(runSD(top, n = 1, cumulative = TRUE), 2), by = ID]
df1
#   ID  top cum_sd
#1: v1  2.3     NA
#2: v1  4.8   1.77
#3: v1 10.2   4.04
#4: v2 16.2     NA
#5: v2 12.2   2.83
#6: v2 14.3   2.00
#7: v2 12.4   1.87
#8: v3  8.2     NA
#9: v3  8.8   0.42
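
If TTR is not available, the cumulative sample standard deviation can also be sketched in base R from running sums. This is only an illustration of what a cumulative SD computes, not how runSD is implemented: cum_sd_running is a hypothetical helper name, and df1 is rebuilt here from the question's data. The one-pass formula can lose precision when the values are large relative to their spread, so treat it as a sketch under those assumptions.

```r
# Hypothetical helper (not part of TTR): cumulative sample SD via running sums.
cum_sd_running <- function(x) {
  n <- seq_along(x)
  # sample variance of the first n values: (sum(x^2) - sum(x)^2 / n) / (n - 1)
  v <- (cumsum(x^2) - cumsum(x)^2 / n) / (n - 1)
  # a single value has no sample SD; report NA, as sd() does
  v[n == 1] <- NA_real_
  # pmax() guards against tiny negative variances caused by rounding error
  sqrt(pmax(v, 0))
}

# Sample data from the question, named df1 as in the answers (assumption).
df1 <- data.frame(
  ID  = c("v1", "v1", "v1", "v2", "v2", "v2", "v2", "v3", "v3"),
  top = c(2.3, 4.8, 10.2, 16.2, 12.2, 14.3, 12.4, 8.2, 8.8)
)

# ave() applies the helper within each ID and returns values in row order
df1$cum_sd <- round(ave(df1$top, df1$ID, FUN = cum_sd_running), 2)
df1
```

Because ave() keeps the original row order, the rows do not need to be sorted by ID first.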
akrun answered Sep 14 '25 02:09


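You can do it with base R functions:

# sd() of each prefix x[1], x[1:2], ..., x[1:n]; the first value is NA
cumsd <- function(x) sapply(lapply(seq_along(x), head, x = x), sd)
# apply cumsd within each ID group, keeping the original row order
df1$cum_sd <- ave(df1$top, df1$ID, FUN = cumsd)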
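
To try this on the question's data, here is a minimal self-contained sketch. It assumes the data frame is named df1, as in the answers, and spells cumsd with vapply, which should be equivalent but always returns a numeric vector.

```r
# Sample data from the question, named df1 as in the answers (assumption).
df1 <- data.frame(
  ID  = c("v1", "v1", "v1", "v2", "v2", "v2", "v2", "v3", "v3"),
  top = c(2.3, 4.8, 10.2, 16.2, 12.2, 14.3, 12.4, 8.2, 8.8)
)

# sd() of the first i values, for each i; sd() of one value is NA
cumsd <- function(x) vapply(seq_along(x), function(i) sd(head(x, i)), numeric(1))

# compute the cumulative SD separately within each ID
df1$cum_sd <- ave(df1$top, df1$ID, FUN = cumsd)
round(df1$cum_sd, 2)
# expected values: NA 1.77 4.04 NA 2.83 2.00 1.87 NA 0.42
```

This calls sd() once per prefix, so the work grows quadratically with group size. That is fine for small groups like these, but may be slow for very long ones.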
jogo answered Sep 14 '25 00:09