I want to compute the mean of every column of my data frame with the dplyr package.
n = c(NA, 3, 5)
s = c("aa", "bb", "cc")
b = c(3, 0, 5)
df = data.frame(n, s, b)
Here I want my function to return mean = 4 for both the n and b columns, i.e. ignoring the NA and the 0.
I tried mean(df$n[df$n>0])
but that is not practical for a large data frame.
I want something like df %>% summarise_each(funs(mean))
...
Thanks
If you don't want the 0s counted, you are probably treating them as NAs, so let's make that explicit and then summarize the numeric columns with na.rm = TRUE:
library(dplyr)
df[df==0] <- NA
summarize_if(df, is.numeric, mean, na.rm = TRUE)
# n b
# 1 4 4
As a one liner:
summarize_if(`[<-`(df, df==0, value= NA), is.numeric, mean, na.rm = TRUE)
And in base R (the result is a named numeric vector):
sapply(`[<-`(df, df==0, value= NA)[sapply(df, is.numeric)], mean, na.rm=TRUE)
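If the backtick call to `[<-` is hard to read, the same base R idea can be wrapped in a small helper. This is only a sketch: the name col_means_nonzero is hypothetical, and it filters out NA and 0 inside each column instead of first recoding 0 to NA.

```r
# Hypothetical helper (not from the answers above): column means that
# ignore both NA and 0, using base R only.
col_means_nonzero <- function(data) {
  # Keep only the numeric columns
  num <- data[vapply(data, is.numeric, logical(1))]
  # For each column, drop NA and 0 before averaging
  vapply(num, function(x) mean(x[!is.na(x) & x != 0]), numeric(1))
}

df <- data.frame(n = c(NA, 3, 5), s = c("aa", "bb", "cc"), b = c(3, 0, 5))
res_base <- col_means_nonzero(df)
res_base
```

Like the sapply() version, this returns a named numeric vector and leaves df itself unchanged.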
See also David's elegant answer:
df %>% summarise_each(funs(mean(.[!is.na(.) & . != 0])), -s)
Or
df %>% summarise_each(funs(mean(.[. != 0], na.rm = TRUE)), -s)
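Note that summarise_each() and funs() are deprecated in recent dplyr releases, and summarize_if() has been superseded. A minimal sketch of the same computation with across(), assuming dplyr 1.0.0 or later, might look like this (na_if() turns each 0 into NA before averaging):

```r
library(dplyr)

df <- data.frame(
  n = c(NA, 3, 5),
  s = c("aa", "bb", "cc"),
  b = c(3, 0, 5)
)

# Treat 0 as missing, then average every numeric column
res <- df %>%
  summarise(across(where(is.numeric), ~ mean(na_if(.x, 0), na.rm = TRUE)))

res
```

Selecting with where(is.numeric) drops the character column s automatically, so there is no need to exclude it by name.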