I have a dataframe of which I put one variable into a vector.
From this vector, I would like to calculate the mean, min and max of every 5 values.
I have managed to calculate the means in this way:
means <- colMeans(matrix(df$values, nrow=5))
I know I can calculate the min and max like this:
max <- max(df$values[1:5])
min <- min(df$values[1:5])
How do I repeat this for every five values?
Additionally, how can I get the statistic and p-value from a one-sample t-test for each group of n rows?
1) tapply Below, g is a grouping variable consisting of five ones, five twos and so on. range returns the minimum and maximum of each group, so tapply produces a list, which simplify2array then reduces to a matrix. Omit the simplify2array if you want a list output. out[1, ] holds the minima and out[2, ] the maxima.
values <- 1:100 # test input
n <- length(values)
g <- rep(1:n, each = 5, length.out = n)
out <- simplify2array(tapply(values, g, range))
giving:
> out
     1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19  20
[1,] 1  6 11 16 21 26 31 36 41 46 51 56 61 66 71 76 81 86 91  96
[2,] 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100
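If the mean is wanted alongside the minimum and maximum, the same grouping idea can be reused with a function that returns all three. The following is only a sketch and not part of the approach above; the name stats and the use of split with sapply are illustrative choices.

```r
values <- 1:100                                   # test input, as above
g <- rep(seq_along(values), each = 5, length.out = length(values))

# split() cuts values into groups of five; the anonymous function
# returns a named vector, so sapply() builds a 3-row matrix
stats <- sapply(split(values, g), function(v) {
  c(mean = mean(v), min = min(v), max = max(v))
})

stats[, 1:3]        # first three groups; rows are mean, min and max
stats["max", ]      # all group maxima
```

Each column of stats corresponds to one group of five, so t(stats) gives one row per group if that layout is more convenient.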
2) aggregate This would also work:
ag <- aggregate(values, list(g = g), range)
giving this data.frame where the first column is g and the second column is the transpose of the matrix in (1). Here ag[[2]][, 1] is the minima and ag[[2]][, 2] is the maxima. If you want to flatten ag try do.call(data.frame, ag) or do.call(cbind, ag) depending on whether you want a 3 column data frame or matrix.
> ag
g x.1 x.2
1 1 1 5
2 2 6 10
3 3 11 15
4 4 16 20
5 5 21 25
6 6 26 30
7 7 31 35
8 8 36 40
9 9 41 45
10 10 46 50
11 11 51 55
12 12 56 60
13 13 61 65
14 14 66 70
15 15 71 75
16 16 76 80
17 17 81 85
18 18 86 90
19 19 91 95
20 20 96 100
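For the one-sample t-test part of the question, the same grouping could probably be combined with t.test. This is a hedged sketch rather than a tested recipe for your data: the null value mu = 0, the random test input and the name tt are all assumptions made for illustration.

```r
set.seed(1)                       # only to make the illustration reproducible
values <- rnorm(100)              # hypothetical test input
g <- rep(seq_along(values), each = 5, length.out = length(values))

# run a one-sample t-test on each group of five against mu = 0 (assumed)
# and keep only the test statistic and the p-value
tt <- sapply(split(values, g), function(v) {
  res <- t.test(v, mu = 0)
  c(statistic = unname(res$statistic), p.value = res$p.value)
})

t(tt)   # one row per group, columns statistic and p.value
```

Note that t.test stops with an error if a group has fewer than two observations or if its values are essentially constant, so a short trailing group may need to be dropped first when the length is not a multiple of five.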