
How do I get mean functions to work when I use piping?

Tags: r, dplyr

This is probably a simple question, but I'm having trouble getting the mean function to work using dplyr.

Using the mtcars dataset as an example, if I type:

data(mtcars)

mtcars %>%
  select(mpg) %>%
  mean()

I get NA back, along with this warning: "Warning message: In mean.default(.) : argument is not numeric or logical: returning NA".

For some reason, though, if I run the same code but ask for summary(), range(), or several other statistical functions instead, it works fine:

data(mtcars)

mtcars %>%
  select(mpg) %>%
  summary()

Similarly, if I run the mean function in base R notation, that works fine too:

mean(mtcars$mpg)

Can anyone point out what I've done wrong?

Jeremy Nel asked Sep 03 '25 08:09



2 Answers

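select() returns a data frame, even when it keeps only one column, and mean() expects a numeric vector. Use pull() to extract the column as a vector.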

mtcars %>%
  pull(mpg) %>%
  mean()
# [1] 20.09062
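
To see why the original pipeline fails while summary() works, it may help to inspect what each step returns. The following is a minimal sketch, assuming dplyr is installed and a reasonably recent version of R; the variable names selected and pulled are only for illustration.

```r
library(dplyr)

# select() keeps the data frame structure, even for a single column
selected <- mtcars %>% select(mpg)
class(selected)
# [1] "data.frame"

# pull() extracts the column as a plain numeric vector
pulled <- mtcars %>% pull(mpg)
class(pulled)
# [1] "numeric"

# mean() falls back to mean.default() for a data frame, which warns and returns NA
suppressWarnings(mean(selected))
# [1] NA
mean(pulled)
# [1] 20.09062

# summary() has a dedicated data frame method, which is why it works after select();
# mean() has none (optional = TRUE makes getS3method() return NULL when nothing is found)
is.function(getS3method("summary", "data.frame"))
# [1] TRUE
is.null(getS3method("mean", "data.frame", optional = TRUE))
# [1] TRUE
```

In short, the pipe itself is not the problem: mean() simply receives a data frame rather than a vector.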

Or use pluck() from the purrr package.

mtcars %>%
  purrr::pluck("mpg") %>%
  mean()
# [1] 20.09062

Or summarize first and then pull out the mean.

mtcars %>%
  summarize(mean = mean(mpg)) %>%
  pull(mean)
# [1] 20.09062
www answered Sep 04 '25 23:09



In dplyr, use summarise() when you are not modifying the original data frame (reordering it, filtering it, adding to it, etc.) but are instead creating a new data frame that holds summary statistics computed from the first one.


mtcars %>%
  summarise(mean_mpg = mean(mpg))

gives the output:

  mean_mpg
1 20.09062

PS. If you're learning dplyr, learning these five verbs will take you a long way: select(), filter(), group_by(), summarise(), arrange().
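
As a rough illustration of how those five verbs combine in one pipeline, here is a small sketch using mtcars. The weight cutoff of 4 and the names by_cyl and mean_mpg are arbitrary choices made for this example, not anything from the question.

```r
library(dplyr)

by_cyl <- mtcars %>%
  select(mpg, cyl, wt) %>%             # keep only the columns of interest
  filter(wt < 4) %>%                   # drop the heaviest cars (arbitrary cutoff)
  group_by(cyl) %>%                    # one group per cylinder count
  summarise(mean_mpg = mean(mpg)) %>%  # collapse each group to a single row
  arrange(desc(mean_mpg))              # highest average mpg first

by_cyl
```

Inside summarise(), mean() receives the mpg column as a vector for each group, so no pull() is needed unless you want a bare vector at the end.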

Jeremy K. answered Sep 05 '25 00:09
