How to use a statistic function and subsetting data simultaneously in R?

Question

I have a data frame (dat) that looks like this:

region muscle    protein
head   cerebrum  78
head   cerebrum  56
head   petiole   1
head   petiole   2
tail   pectoral  3
tail   pectoral  4

I want to take the mean of the protein values for cerebrum. I tried to look up different ways to subset data here and here, but there does not seem to be a straightforward way of doing it. Right now, I'm doing this:

datcerebrum <- dat[which(dat$muscle == "cerebrum"),]
mean(datcerebrum$protein)

I tried to condense this into one line:

mean(dat[which(dat$muscle == "cerebrum"),])

But it returns NA with a warning that the argument is not numeric or logical. Is there an easy way to achieve this?

akrun · Accepted Answer

We can use aggregate() from base R to get the mean of protein for each muscle:

aggregate(protein ~ muscle, dat, mean)
#   muscle protein
#1 cerebrum    67.0
#2 pectoral     3.5
#3  petiole     1.5
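
If only the cerebrum value is needed, the one-liner from the question can be repaired. The attempt there passed whole rows of the data frame to mean(), which expects a numeric vector, hence the NA and the warning. A minimal sketch, assuming dat is an ordinary data.frame rebuilt here from the values in the question (the names cerebrum_mean, cerebrum_mean2 and cerebrum_mean3 are made up for illustration):

```r
# Rebuild the example data from the question
dat <- data.frame(
  region  = c("head", "head", "head", "head", "tail", "tail"),
  muscle  = c("cerebrum", "cerebrum", "petiole", "petiole", "pectoral", "pectoral"),
  protein = c(78, 56, 1, 2, 3, 4)
)

# Subset the protein column (a numeric vector), then take its mean
cerebrum_mean <- mean(dat$protein[dat$muscle == "cerebrum"])

# Equivalent: pick the rows and the single column in one indexing step
cerebrum_mean2 <- mean(dat[dat$muscle == "cerebrum", "protein"])

# Equivalent: with() avoids repeating dat$
cerebrum_mean3 <- with(dat, mean(protein[muscle == "cerebrum"]))

cerebrum_mean
# [1] 67
```

The which() call in the original is not required here, since a logical vector can index directly. It only changes the result when muscle contains NA values.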

Mike Stanley · Answer

I'd do this with the tidyverse package dplyr:

library(readr)
library(dplyr)
fwf <- "head   cerebrum  78
head   cerebrum  56
head   petiole   1
head   petiole   2
tail   pectoral  3
tail   pectoral  4"
dat <- read_fwf(fwf, fwf_empty(fwf, col_names = c("region", "muscle", "protein")))
# The above code is just to create your data frame - please provide reproducible data!

dat %>% filter(muscle == "cerebrum") %>% summarise(m = mean(protein))
#> # A tibble: 1 x 1
#>       m
#>   <dbl>
#> 1    67

You could even do it for every muscle at once:

dat %>% group_by(muscle) %>% summarise(m = mean(protein))
#> # A tibble: 3 x 2
#>     muscle     m
#>      <chr> <dbl>
#> 1 cerebrum  67.0
#> 2 pectoral   3.5
#> 3  petiole   1.5
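
If loading dplyr is not an option, the grouped result can probably be reproduced in base R with tapply(), which returns a named vector that can be indexed by muscle. This is only a sketch under the assumption that dat is a plain data.frame with the columns shown in the question; the name means is made up for illustration:

```r
# Rebuild the example data from the question
dat <- data.frame(
  region  = c("head", "head", "head", "head", "tail", "tail"),
  muscle  = c("cerebrum", "cerebrum", "petiole", "petiole", "pectoral", "pectoral"),
  protein = c(78, 56, 1, 2, 3, 4)
)

# Mean of protein within each level of muscle, returned as a named vector
means <- tapply(dat$protein, dat$muscle, mean)

# Look up a single group by name
means[["cerebrum"]]
# [1] 67
```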


Tags: r, subset, mean

Ash
