So I have a dataframe that looks like this:
Day HL.Average D.Average LL.Average noHKB.Average HL.SD D.SD LL.SD noHKB.SD
1 0.00000 8760 8900 10000 8030 2337.844 924.2742 1120.785 1592.646
2 1.90625 13300 11900 12100 3860 1016.291 2308.2661 3581.763 1031.057
3 3.00000 14500 7320 12300 1750 2945.098 1308.0389 4338.897 1793.583
4 4.00000 16200 9160 15100 2710 1006.893 514.2177 4362.261 2691.648
And I'm trying to organize it into this:
(The numbers in this table is a little different because I made this table in excel instead...)
Day Group Average SD
0 HL 8.76E+03 2337.843612
2 HL 1.33E+04 1016.291341
3 HL 1.45E+04 2945.098059
4 HL 1.62E+04 1006.892976
0 LL 1.00E+04 1120.785009
2 LL 1.21E+04 3581.762692
3 LL 1.23E+04 4338.897375
4 LL 1.51E+04 4362.260616
0 D 8.90E+03 924.2741667
2 D 1.19E+04 2308.266118
3 D 7.32E+03 1308.038881
4 D 9.16E+03 514.2176747
I thought about using stack() and then merge(), but the problem is that my original "xx.Average" and "xx.SD" headings complicates the resulting ind from stack...
Does anyone know of a command/way to easily organize the data?
Thanks so much!
Here is my clumsy solution. It works but it requires some package (dplyr
as well for the pipe operator %>%
but that is just a personal choice.).
It has the melt
function from the reshape2
package, the separate
from tidyr
to separate columns (for example LL.Average
into LL
and Average
) and the dcast
from reshape2
to spread with a formula.
Note that I did not changed the noHKB
but this might be done with an easy step, it is just a personal choice where to do it.
library(reshape2)
library(dplyr)
library(tidyr)
df %>% melt(id.vars = "Day") %>%
separate(variable, c("Group", "Measure"), sep = "\\.") %>%
dcast(...~ Measure) -> df1
df1
Day Group Average SD
1 0.00000 D 8900 924.2742
2 0.00000 HL 8760 2337.8440
3 0.00000 LL 10000 1120.7850
4 0.00000 noHKB 8030 1592.6460
5 1.90625 D 11900 2308.2661
6 1.90625 HL 13300 1016.2910
7 1.90625 LL 12100 3581.7630
8 1.90625 noHKB 3860 1031.0570
9 3.00000 D 7320 1308.0389
10 3.00000 HL 14500 2945.0980
11 3.00000 LL 12300 4338.8970
12 3.00000 noHKB 1750 1793.5830
13 4.00000 D 9160 514.2177
14 4.00000 HL 16200 1006.8930
15 4.00000 LL 15100 4362.2610
16 4.00000 noHKB 2710 2691.6480
@ananda-mahto's splitstackshape package is designed for this type of problem!
Unfortunately it expects the variable names to be prefixes, rather than postfixes, so we will first need to reverse the variable names
library(splitstackshape)
# reverse variable names:
names(df) <- sapply(strsplit(names(df), "\\."), function(x) paste(rev(x), collapse="."))
# Reshape the data frame:
newdf <- merged.stack(df, id.vars=c("Day"), var.stubs=c("Average", "SD"), sep=".")
setnames(newdf, ".time_1", "Group")
newdf
## Day Group Average SD
## 1: 0.00000 D 8900 924.2742
## 2: 0.00000 HL 8760 2337.8440
## 3: 0.00000 LL 10000 1120.7850
## 4: 0.00000 noHKB 8030 1592.6460
## 5: 1.90625 D 11900 2308.2661
## 6: 1.90625 HL 13300 1016.2910
## 7: 1.90625 LL 12100 3581.7630
## 8: 1.90625 noHKB 3860 1031.0570
## 9: 3.00000 D 7320 1308.0389
## 10: 3.00000 HL 14500 2945.0980
## 11: 3.00000 LL 12300 4338.8970
## 12: 3.00000 noHKB 1750 1793.5830
## 13: 4.00000 D 9160 514.2177
## 14: 4.00000 HL 16200 1006.8930
## 15: 4.00000 LL 15100 4362.2610
## 16: 4.00000 noHKB 2710 2691.6480
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With