Apply t-test on many columns in a dataframe split by factor

Question

I have a dataframe with one factor column with two levels, and many numeric columns. I want to split the dataframe by the factor column and run a t-test on each pair of columns.

Using the example dataset Puromycin I want the result to look something like this:

Variable    Treated   Untreated   p-value   Test-statistic   CI of difference
Conc        0.3450    0.2763      XXX       T                XX - XX
Rate        141.58    110.7272    XXX       T                XX - XX

I think I am looking for a solution using plyr that can output the above results in a nice dataframe.

(The Puromycin dataset only contains two numeric variables, but the solution I am looking for should work on a dataframe with many numeric variables.)

UPDATE - I will try to clarify what I mean.

I would like to go from data that looks like this:

Grouping variable   var1    var2    var3    var4    var5
1           3   5   7   3   7
1           3   7   5   9   6
1           5   2   6   7   6
1           9   5   7   0   8
1           2   4   5   7   8
1           2   3   1   6   4
2           4   2   7   6   5
2           0   8   3   7   5
2           1   2   3   5   9
2           1   5   3   8   0
2           2   6   9   0   7
2           3   6   7   8   8
2           10  6   3   8   0

To a result dataframe that looks like this:

        "Mean in group 1"   "Mean in group 2"   "P-value of difference"   "N"
var1    ##                  ##                  ##                        ##
var2    ##                  ##                  ##                        ##
var3    ##                  ##                  ##                        ##
var4    ##                  ##                  ##                        ##
var5    ##                  ##                  ##                        ##

Maybe mapply is what I am looking for: I want to split my dataframe into dataframe1 and dataframe2 by a two-level factor, then apply a function (a t-test) to the first columns of dataframe1 and dataframe2, then to the second columns, then to the third, and so on for every column pair produced by the split.

Sven Hohenstein · Accepted Answer

Maybe this produces the result you are looking for:

df <- read.table(text="Group   var1    var2    var3    var4    var5
1           3   5   7   3   7
1           3   7   5   9   6
1           5   2   6   7   6
1           9   5   7   0   8
1           2   4   5   7   8
1           2   3   1   6   4
2           4   2   7   6   5
2           0   8   3   7   5
2           1   2   3   5   9
2           1   5   3   8   0
2           2   6   9   0   7
2           3   6   7   8   8
2           10  6   3   8   0", header = TRUE)


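# For every column except the first (Group), run a t-test against Group and
# keep the group means, p-value, t statistic and confidence interval.
t(sapply(df[-1], function(x)
     unlist(t.test(x~df$Group)[c("estimate","p.value","statistic","conf.int")])))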

The result:

     estimate.mean in group 1 estimate.mean in group 2   p.value statistic.t conf.int1 conf.int2
var1                 4.000000                 3.000000 0.5635410   0.5955919 -2.696975  4.696975
var2                 4.333333                 5.000000 0.5592911  -0.6022411 -3.104788  1.771454
var3                 5.166667                 5.000000 0.9028444   0.1249164 -2.770103  3.103436
var4                 5.333333                 6.000000 0.7067827  -0.3869530 -4.497927  3.164593
var5                 6.500000                 4.857143 0.3053172   1.0925986 -1.803808  5.089522
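
If the table should also carry the group sizes (the "N" column asked for in the question) and come back as a regular data frame, the same idea can be wrapped in a small helper. This is only a sketch building on the call above: the name ttest_by_group and the output column names are made up for illustration, and it assumes the grouping column has exactly two levels. It is shown on the built-in Puromycin data so that it runs as is.

```r
# Hypothetical helper (not part of any package): t-test of every numeric
# column against a two-level grouping column, returned as a data frame.
ttest_by_group <- function(data, group) {
  g <- factor(data[[group]])
  stopifnot(nlevels(g) == 2)
  # All numeric columns except the grouping column itself
  vars <- setdiff(names(data)[sapply(data, is.numeric)], group)
  rows <- lapply(vars, function(v) {
    tt <- t.test(data[[v]] ~ g)        # Welch two-sample test (R's default)
    n  <- table(g[!is.na(data[[v]])])  # non-missing observations per group
    data.frame(variable  = v,
               mean1     = unname(tt$estimate[1]),
               mean2     = unname(tt$estimate[2]),
               p.value   = tt$p.value,
               statistic = unname(tt$statistic),
               ci.lower  = tt$conf.int[1],
               ci.upper  = tt$conf.int[2],
               n1        = as.integer(n[1]),
               n2        = as.integer(n[2]))
  })
  do.call(rbind, rows)
}

res <- ttest_by_group(Puromycin, "state")
res
```

With the example data from the question the call would be ttest_by_group(df, "Group"); mean1 and mean2 then refer to groups 1 and 2 in the order of the factor levels.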

Jilber Urbina · Answer

Maybe you can find this useful

res <- sapply(split(Puromycin[,-3],  Puromycin$state), t.test)[c(1:3,5),]
conf.level <- sapply(sapply(split(Puromycin[,-3],  Puromycin$state), t.test)[4, ], '[', 1:2)
res <- rbind(res, conf.level.lower=conf.level[1,], conf.level.upper=conf.level[2,])
res
                 treated    untreated   
statistic        4.297025   4.206221    
parameter        23         21          
p.value          0.00026856 0.0003968191
estimate         70.96417   55.50182    
conf.level.lower 36.80086   28.06095    
conf.level.upper 105.1275   82.94268
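
Note that the output above comes from a one-sample test within each group, not from a comparison between the groups: the degrees of freedom (23 and 21) match the pooled conc and rate values of the 12 treated and 11 untreated rows. If a treated-versus-untreated test per variable is what is wanted, one possible sketch follows the mapply idea from the question and walks over the columns of the two split pieces in parallel. The result names (mean.treated, ci.lower and so on) are chosen here for illustration only.

```r
# Split the numeric columns of Puromycin by the two-level factor.
parts <- split(Puromycin[c("conc", "rate")], Puromycin$state)

# mapply iterates over the columns of both pieces in parallel, so each call
# receives the same variable from the treated and from the untreated rows.
res <- mapply(function(a, b) {
  tt <- t.test(a, b)  # Welch two-sample t-test (R's default)
  c(mean.treated   = mean(a),
    mean.untreated = mean(b),
    p.value        = tt$p.value,
    statistic      = unname(tt$statistic),
    ci.lower       = tt$conf.int[1],
    ci.upper       = tt$conf.int[2])
}, parts$treated, parts$untreated)

# One row per variable, one column per quantity
as.data.frame(t(res))
```

Because the pieces are matched by column position, this relies on both parts having the same columns in the same order, which split() guarantees here.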

Tags: dataframe, r, plyr

Rasmus Larsen