How to run ANOVA on a wide format data.frame?

Tags: dataframe, r, statistics, reshape, anova

I've been taught to run an ANOVA with the formula: aov(dependent variable ~ independent variable, dataset)

but I am struggling to run an ANOVA on a particular dataset because the values are spread across three columns. The columns are named Newborn, adolescent and adult (the hamster's age), and the values in each column are blood pressure readings. I need to run a test to determine whether there is a relationship between blood pressure and age.

This is what the data looks like in R:

> hamster
   Newborn adolescent adult
1      108        110   105
2      110        105   100
3       90        100    95
4       80         90    85
5      100        102    97
6      120        110   105
7      125        105   100
8      130        115   110
9      120        100    95
10     130        120   115
11     145        130   125
12     150        125   120
13     130        135   130
14     155        130   125
15     140        120   115

I'm confused because the dependent variable is not in a column of its own: it is the set of values inside each of the three age columns.


asked Apr 29 '18 23:04

Victoria Fletcher

2 Answers

The first step is to rearrange your data so it's in a "long" format instead of a "wide" format. This can be done in base R using the reshape function, but it's much easier to use the gather function in the tidyr package:

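library(tidyr)  # also re-exports the %>% pipe
result <- hamster %>%
  gather(age, bp) %>%      # long format: column names go into `age`, values into `bp`
  aov(bp ~ age, data = .)  # `.` passes the piped data frame as the `data` argument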

Loading tidyr also gives us the pipe operator (%>%), which lets you chain commands together in a readable way. By default, it takes the result of the previous function and inserts it as the first argument of the next function. In the aov call, we override this with the . placeholder to explicitly pass the data set produced by gather as the second argument.
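For readers who would rather avoid the extra package, here is a minimal sketch of the base R reshape route mentioned above. The data frame is retyped from the question, and the names `long`, `fit`, `bp`, `age` and `id` are chosen here for illustration; treat it as one possible way to do it rather than the only one.

```r
# Data retyped from the question (wide format: one column per age group)
hamster <- data.frame(
  Newborn    = c(108, 110, 90, 80, 100, 120, 125, 130, 120, 130, 145, 150, 130, 155, 140),
  adolescent = c(110, 105, 100, 90, 102, 110, 105, 115, 100, 120, 130, 125, 135, 130, 120),
  adult      = c(105, 100, 95, 85, 97, 105, 100, 110, 95, 115, 125, 120, 130, 125, 115)
)

# Wide -> long: all three columns are stacked into one value column `bp`,
# and the original column name is recorded in `age`
long <- reshape(
  hamster,
  direction = "long",
  varying   = list(names(hamster)),  # the columns to stack
  v.names   = "bp",                  # name of the stacked value column
  timevar   = "age",                 # name of the new grouping column
  times     = names(hamster),        # labels written into `age`
  idvar     = "id"                   # row identifier created by reshape
)
long$age <- factor(long$age)

# One-way ANOVA on the long data
fit <- aov(bp ~ age, data = long)
summary(fit)
```

The result has one row per measurement (45 rows here), which is the layout aov expects: one column for the response and one for the grouping factor.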


answered Sep 22 '22 06:09

Melissa Key

R has a useful function called stack to convert your data format into the one needed for ANOVA.

aov(values ~ ind, stack(hamster))

# Call:
#
# aov(formula = values ~ ind, data = stack(hamster))
#
# Terms:
#                       ind Residuals
# Sum of Squares   1525.378 11429.867
# Deg. of Freedom         2        42
#
# Residual standard error: 16.49666
# Estimated effects may be unbalanced
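Printing the aov object as above shows the sums of squares but not the test itself. As a small sketch (the data frame is retyped from the question, and `long` and `fit` are names chosen here), you can inspect what stack produces and then call summary to get the F statistic and p-value for the age effect:

```r
# Data retyped from the question
hamster <- data.frame(
  Newborn    = c(108, 110, 90, 80, 100, 120, 125, 130, 120, 130, 145, 150, 130, 155, 140),
  adolescent = c(110, 105, 100, 90, 102, 110, 105, 115, 100, 120, 130, 125, 135, 130, 120),
  adult      = c(105, 100, 95, 85, 97, 105, 100, 110, 95, 115, 125, 120, 130, 125, 115)
)

# stack() returns two columns: `values` (the numbers) and
# `ind` (a factor holding the name of the column each value came from)
long <- stack(hamster)
str(long)

# Fit the one-way ANOVA and print the full table,
# including the F statistic and its p-value
fit <- aov(values ~ ind, data = long)
summary(fit)
```

The column names `values` and `ind` are fixed by stack, which is why the formula is written as values ~ ind.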

answered Sep 22 '22 06:09

Karolis Koncevičius
