I would like to calculate the standard deviation of every 4 values down a column, from the first to the last observation. I have found lots of answers for moving SD functions, but I simply need a line of code that will calculate the sd() for every 4 data values and write the answers into a new column in the data frame, as below:
Example data:
Obs Count
1 56
2 29
3 66
4 62
5 49
6 12
7 65
8 81
9 73
10 66
11 71
12 59
Desired output:
Obs Count SD
1 56 16.68
2 29 16.68
3 66 16.68
4 62 16.68
5 49 29.55
6 12 29.55
7 65 29.55
8 81 29.55
9 73 6.24
10 66 6.24
11 71 6.24
12 59 6.24
I tried the below code, but this is obviously incorrect:
a <- for(i in 1: length(df)) sd(df$Count[i:(i+3)])
This should be a very easy task, but I have not been able to find an answer. I am still learning and any help would be appreciated.
In base R, you can use the following to create an index of "every 4 rows":
(seq_len(nrow(mydf))-1) %/% 4
# [1] 0 0 0 0 1 1 1 1 2 2 2 2
Using that, you can use ave to get the desired result:
mydf$SD <- ave(mydf$Count, (seq_len(nrow(mydf))-1) %/% 4, FUN = sd)
mydf
# Obs Count SD
# 1 1 56 16.680827
# 2 2 29 16.680827
# 3 3 66 16.680827
# 4 4 62 16.680827
# 5 5 49 29.545163
# 6 6 12 29.545163
# 7 7 65 29.545163
# 8 8 81 29.545163
# 9 9 73 6.238322
# 10 10 66 6.238322
# 11 11 71 6.238322
# 12 12 59 6.238322
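For anyone who wants to reproduce this from scratch, here is a minimal sketch that rebuilds the example data and wraps the same index-plus-ave idea in a small function. The helper name block_sd and its n argument are illustrative assumptions, not part of base R.

```r
# Example data from the question
mydf <- data.frame(Obs = 1:12,
                   Count = c(56, 29, 66, 62, 49, 12, 65, 81, 73, 66, 71, 59))

# Hypothetical helper (the name is illustrative): SD of consecutive,
# non-overlapping blocks of `n` values, repeated for every row in the block
block_sd <- function(x, n = 4) {
  grp <- (seq_along(x) - 1) %/% n   # 0 0 0 0 1 1 1 1 2 2 2 2 ...
  ave(x, grp, FUN = sd)
}

mydf$SD <- block_sd(mydf$Count)
round(unique(mydf$SD), 2)
# [1] 16.68 29.55  6.24
```

If the number of rows is not a multiple of n, the last group is simply shorter; a final group with a single value gets NA, because sd() of one value is NA.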
An alternative is to use rollapply from the zoo package in combination with rep:
> library(zoo)
> N <- 4 # every four values
> SDs <- rollapply(df[,2], width=N, by=N, sd)
> df$SD <- rep(SDs, each=N)
> df
Obs Count SD
1 1 56 16.680827
2 2 29 16.680827
3 3 66 16.680827
4 4 62 16.680827
5 5 49 29.545163
6 6 12 29.545163
7 7 65 29.545163
8 8 81 29.545163
9 9 73 6.238322
10 10 66 6.238322
11 11 71 6.238322
12 12 59 6.238322
You can also do it all at once:
df$SD <- rep( rollapply(df[,2], width=N, by=N, sd), each=N)
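One thing to watch with this approach: rep(..., each=N) only lines up with the data frame when the number of rows is a multiple of N. The sketch below shows one possible way to handle a trailing partial block by leaving those rows as NA. The 14-row data set is made up for illustration (the last two counts are not from the question), and it assumes rollapply returns values only for complete windows when no fill is given.

```r
library(zoo)

# Hypothetical data: 14 observations, so the last block has only 2 values
# (the final two counts are invented for this illustration)
df <- data.frame(Obs = 1:14,
                 Count = c(56, 29, 66, 62, 49, 12, 65, 81,
                           73, 66, 71, 59, 40, 44))
N <- 4  # block size

# One SD per complete, non-overlapping block of N values
SDs <- rollapply(df$Count, width = N, by = N, FUN = sd)

# Repeat each block SD N times, then index by row number;
# indices past the end of the vector yield NA for the incomplete block
block_sd <- rep(SDs, each = N)
df$SD <- block_sd[seq_len(nrow(df))]
```

Whether NA is the right answer for an incomplete block depends on the analysis; the ave approach above would instead compute the SD of whatever values remain.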