I have a data frame with one two-level factor column and many numeric columns. I want to split the data frame by the factor column and run a t-test on each resulting pair of columns.
Using the example dataset Puromycin I want the result to look something like this:
Variable  Treated  Untreated  p-value  Test-statistic  CI of difference
Conc      0.3450   0.2763     XXX      T               XX - XX
Rate      141.58   110.7272   xxx      T               XX - XX
I think I am looking for a solution using plyr that can output the above results in a nice data frame.
(Puromycin contains only two numeric variables, but the solution I am looking for should work on a data frame with many numeric variables.)
UPDATE - I will try to clarify what I mean.
I would like to go from data that looks like this:
Grouping variable var1 var2 var3 var4 var5
1 3 5 7 3 7
1 3 7 5 9 6
1 5 2 6 7 6
1 9 5 7 0 8
1 2 4 5 7 8
1 2 3 1 6 4
2 4 2 7 6 5
2 0 8 3 7 5
2 1 2 3 5 9
2 1 5 3 8 0
2 2 6 9 0 7
2 3 6 7 8 8
2 10 6 3 8 0
To a result data frame that looks like this:
"Mean in group 1" "Mean in group 2" "P-value of difference" "N"
var1 ## ## ## ##
var2 ## ## ## ##
var3 ## ## ## ##
var4 ## ## ## ##
var5 ## ## ## ##
Maybe mapply is what I am looking for: I want to split my data frame into dataframe1 and dataframe2 by a two-level factor, then apply a function (a t-test) to the first columns of dataframe1 and dataframe2, then to the second columns, then to the third, and so on for every column pair produced by the split.
Maybe this produces the result you are looking for:
df <- read.table(text="Group var1 var2 var3 var4 var5
1 3 5 7 3 7
1 3 7 5 9 6
1 5 2 6 7 6
1 9 5 7 0 8
1 2 4 5 7 8
1 2 3 1 6 4
2 4 2 7 6 5
2 0 8 3 7 5
2 1 2 3 5 9
2 1 5 3 8 0
2 2 6 9 0 7
2 3 6 7 8 8
2 10 6 3 8 0", header = TRUE)
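# For each numeric column, run a t-test between the two groups and keep the
# group means, p-value, test statistic and confidence interval.
t(sapply(df[-1], function(x)
  unlist(t.test(x ~ df$Group)[c("estimate", "p.value", "statistic", "conf.int")])))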
The result:
     estimate.mean in group 1 estimate.mean in group 2   p.value statistic.t conf.int1 conf.int2
var1                 4.000000                 3.000000 0.5635410   0.5955919 -2.696975  4.696975
var2                 4.333333                 5.000000 0.5592911  -0.6022411 -3.104788  1.771454
var3                 5.166667                 5.000000 0.9028444   0.1249164 -2.770103  3.103436
var4                 5.333333                 6.000000 0.7067827  -0.3869530 -4.497927  3.164593
var5                 6.500000                 4.857143 0.3053172   1.0925986 -1.803808  5.089522
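The question also asked for N and for a data frame rather than a matrix. A minimal sketch of that variant follows. The helper name `summarise_ttest` and its output column names are made up for illustration, and it assumes the grouping column has exactly two levels and that the Welch test (the `t.test` default) is acceptable.

```r
# Small example data: the first two variables from the question.
df <- data.frame(
  Group = rep(c(1, 2), times = c(6, 7)),
  var1  = c(3, 3, 5, 9, 2, 2, 4, 0, 1, 1, 2, 3, 10),
  var2  = c(5, 7, 2, 5, 4, 3, 2, 8, 2, 5, 6, 6, 6)
)

# Hypothetical helper: one Welch t-test per remaining column,
# returned as one data-frame row per variable.
summarise_ttest <- function(data, group_col) {
  g <- factor(data[[group_col]])
  stopifnot(nlevels(g) == 2)               # only defined for two groups
  vars <- setdiff(names(data), group_col)  # every column except the grouping one
  rows <- lapply(vars, function(v) {
    x  <- data[[v]]
    tt <- t.test(x ~ g)
    data.frame(
      variable = v,
      mean_1   = unname(tt$estimate[1]),
      mean_2   = unname(tt$estimate[2]),
      p_value  = tt$p.value,
      n        = sum(!is.na(x)),           # non-missing observations over both groups
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

result <- summarise_ttest(df, "Group")
print(result)
```

Because the result is an ordinary data frame, it can be sorted or filtered by `p_value` like any other column.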
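Maybe you can find this useful. Note that it runs a one-sample t-test within each state, with conc and rate pooled together (hence 23 and 21 degrees of freedom), rather than comparing the two states: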
res <- sapply(split(Puromycin[,-3], Puromycin$state), t.test)[c(1:3,5),]
conf.level <- sapply(sapply(split(Puromycin[,-3], Puromycin$state), t.test)[4, ], '[', 1:2)
res <- rbind(res, conf.level.lower=conf.level[1,], conf.level.upper=conf.level[2,])
res
treated untreated
statistic 4.297025 4.206221
parameter 23 21
p.value 0.00026856 0.0003968191
estimate 70.96417 55.50182
conf.level.lower 36.80086 28.06095
conf.level.upper 105.1275 82.94268
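To compare treated against untreated directly, as in the table requested in the question, a two-sample variant can be sketched as below. The name `res2` and the row labels are chosen here for illustration only, and the Welch test (the `t.test` default) is an assumption rather than something the question specifies.

```r
# Puromycin ships with R (package 'datasets'); 'state' is the two-level factor.
num_vars <- names(Puromycin)[sapply(Puromycin, is.numeric)]

# One Welch two-sample t-test per numeric column, treated vs untreated.
res2 <- t(sapply(Puromycin[num_vars], function(x) {
  tt <- t.test(x ~ Puromycin$state)
  c(treated   = unname(tt$estimate[1]),   # mean of the 'treated' group
    untreated = unname(tt$estimate[2]),   # mean of the 'untreated' group
    p.value   = tt$p.value,
    statistic = unname(tt$statistic),
    ci.lower  = tt$conf.int[1],           # CI of the difference in means
    ci.upper  = tt$conf.int[2])
}))
res2
```

Each row of `res2` is one numeric variable, so the same call extends to data frames with many numeric columns.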