dplyr summarise multiple columns using t.test

Is it possible somehow to do a t.test over multiple variables against the same categorical variable without going through a reshaping of the dataset as follows?

data(mtcars)
library(dplyr)
library(tidyr)
j <- mtcars %>% gather(var, val, disp:qsec)
t <- j %>% group_by(var) %>% do(te = t.test(val ~ vs, data = .))

t %>% summarise(p = te$p.value)

I've tried using

mtcars %>% summarise_each_(funs = (t.test(. ~ vs))$p.value, vars = disp:qsec)

but it throws an error.

Bonus: How can t %>% summarise(p = te$p.value) also include the name of the grouping variable?

asked Oct 07 '14 20:10

Misha

2 Answers

After the discussion with @aosmith and @Misha, here is one approach. As @aosmith wrote in the comments, you want to do the following.

mtcars %>%
    summarise_each(funs(t.test(.[vs == 0], .[vs == 1])$p.value), vars = disp:qsec)

#         vars1        vars2      vars3        vars4        vars5
#1 2.476526e-06 1.819806e-06 0.01285342 0.0007281397 3.522404e-06

vs is either 0 or 1 (the group). To run a t-test between the two groups for a variable (e.g., disp), it seems that you need to subset the data as @aosmith suggested. Thank you for the contribution.
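For readers on a recent dplyr, here is a minimal sketch of the same subsetting idea written with `across()`. It assumes dplyr 1.0.0 or later, where `summarise_each()` and `funs()` are deprecated. The name `p_values` is only illustrative. Because `across()` keeps the original column names, the result should be labeled `disp`, `hp`, `drat`, `wt`, `qsec` rather than `vars1` to `vars5`.

```r
library(dplyr)

data(mtcars)

# For each column from disp to qsec, split the values by vs (0 or 1)
# and keep only the p-value of the (Welch) two-sample t-test.
p_values <- mtcars %>%
  summarise(across(disp:qsec,
                   ~ t.test(.x[vs == 0], .x[vs == 1])$p.value))

print(p_values)
```

Inside the lambda, `.x` is the current column and `vs` is looked up in the data frame, so no reshaping is needed.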

What I originally suggested works in a different situation, in which you simply compare two columns. Here is some sample data and code.

foo <- data.frame(country = "Iceland",
                  year = 2014,
                  id = 1:30,
                  A = sample.int(1e5, 30, replace = TRUE),
                  B = sample.int(1e5, 30, replace = TRUE),
                  C = sample.int(1e5, 30, replace = TRUE),
                  stringsAsFactors = FALSE)

If you want to run paired t-tests for the A-C and B-C combinations, the following would be one way.

foo2 <- foo %>%
        summarise_each(funs(t.test(., C, paired = TRUE)$p.value), vars = A:B)

names(foo2) <- colnames(foo[4:5])

#          A         B
#1 0.2937979 0.5316822
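The column-versus-column comparison can also be sketched with `across()`, again assuming dplyr 1.0.0 or later. The `set.seed(1)` call and the name `paired_p` are assumptions added here so the random sample is reproducible; the seed value is arbitrary, and the p-values will differ from the ones shown above because the data are random.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, added only for reproducibility

foo <- data.frame(id = 1:30,
                  A = sample.int(1e5, 30, replace = TRUE),
                  B = sample.int(1e5, 30, replace = TRUE),
                  C = sample.int(1e5, 30, replace = TRUE))

# Paired t-test of each of A and B against column C.
paired_p <- foo %>%
  summarise(across(A:B, ~ t.test(.x, C, paired = TRUE)$p.value))

print(paired_p)
```

Since `across()` preserves the column names, the separate `names(foo2) <- ...` step should not be needed with this form.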

answered Oct 19 '22 15:10

jazzurro

I like the following solution using the powerful "broom" package:

library("dplyr")
library("broom")

your_db %>%
  group_by(grouping_variable1, grouping_variable2 ...) %>%
  do(tidy(t.test(variable_u_want_2_test ~ dichotomous_grouping_var, data = .)))
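As a concrete sketch of this template on the question's `mtcars` data, the following assumes tidyr 1.0.0 or later (for `pivot_longer()`) and the broom package. The name `tidy_tests` is illustrative. Note that this route still reshapes the data, but it keeps the grouping column `var` next to each p-value, which is what the bonus question asks for.

```r
library(dplyr)
library(tidyr)
library(broom)

data(mtcars)

tidy_tests <- mtcars %>%
  # Long format: one row per (car, variable) pair.
  pivot_longer(disp:qsec, names_to = "var", values_to = "val") %>%
  group_by(var) %>%
  # One t-test per variable; tidy() turns the htest object into a one-row data frame.
  do(tidy(t.test(val ~ vs, data = .))) %>%
  ungroup() %>%
  select(var, statistic, p.value)

print(tidy_tests)
```

Each row of the result corresponds to one tested variable, so the variable name travels with its test statistic and p-value.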

answered Oct 19 '22 16:10

carfisma
