Using the following dataframe I would like to group the data by replicate and group and then calculate a ratio of treatment values to control values.
structure(list(group = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L), .Label = c("case", "controls"), class = "factor"), treatment = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "EPA", class = "factor"),
replicate = structure(c(2L, 4L, 3L, 1L, 2L, 4L, 3L, 1L), .Label = c("four",
"one", "three", "two"), class = "factor"), fatty_acid_family = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "saturated", class = "factor"),
fatty_acid = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "14:0", class = "factor"),
quant = c(6.16, 6.415, 4.02, 4.05, 4.62, 4.435, 3.755, 3.755
)), .Names = c("group", "treatment", "replicate", "fatty_acid_family",
"fatty_acid", "quant"), class = "data.frame", row.names = c(NA,
-8L))
I have tried using dplyr as follows:
group_by(dataIn, replicate, group) %>% transmute(ratio = quant[group=="case"]/quant[group=="controls"])
but this results in Error: incompatible size (%d), expecting %d (the group size) or 1
Initially I thought this might be because I was trying to create 4 ratios from a df 8 rows deep and so I thought summarise
might be the answer (collapsing each group to one ratio) but that doesn't work either (my understanding is a shortcoming).
group_by(dataIn, replicate, group) %>% summarise(ratio = quant[group=="case"]/quant[group=="controls"])
replicate group ratio
1 four case NA
2 four controls NA
3 one case NA
4 one controls NA
5 three case NA
6 three controls NA
7 two case NA
8 two controls NA
I would appreciate some advice on where I'm going wrong or even if this can be done with dplyr
.
Thanks.
Groupby Function in R – group_by is used to group the dataframe in R. Dplyr package in R is provided with group_by() function which groups the dataframe by multiple columns with mean, sum and other functions like count, maximum and minimum.
You can try:
group_by(dataIn, replicate) %>%
summarise(ratio = quant[group=="case"]/quant[group=="controls"])
#Source: local data frame [4 x 2]
#
# replicate ratio
#1 four 1.078562
#2 one 1.333333
#3 three 1.070573
#4 two 1.446449
Because you grouped by replicate and group, you could not access data from different groups at the same time.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With