Is there an elegant way to treat NA as 0 (as with na.rm = TRUE) in dplyr?
data <- data.frame(a=c(1,2,3,4), b=c(4,NA,5,6), c=c(7,8,9,NA))
data %>% mutate(sum = a + b + c)

 a  b  c sum
 1  4  7  12
 2 NA  8  NA
 3  5  9  17
 4  6 NA  NA
but I would like to get
 a  b  c sum
 1  4  7  12
 2 NA  8  10
 3  5  9  17
 4  6 NA  10
even though I know that this is not the desired result in many other cases.
To find the sum of the non-missing values in an R data frame column, we can simply use the sum function and set na.rm to TRUE. For example, if we have a data frame called df that contains a column, say x, which has some missing values, then the sum of the non-missing values can be found by using the command sum(df$x,na.
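As a minimal sketch of the column-wise case described above, assuming a small made-up data frame `df` with a column `x` (both hypothetical, chosen only for illustration):

```r
# Hypothetical example data: one column with a missing value
df <- data.frame(x = c(1, NA, 3))

# Without na.rm the missing value propagates and the result is NA
total_with_na <- sum(df$x)

# With na.rm = TRUE the NA is dropped before summing
total_without_na <- sum(df$x, na.rm = TRUE)

print(total_with_na)     # NA
print(total_without_na)  # 4
```

Note that this sums down a single column, while the question asks for a sum across columns within each row.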
You could use this:
library(dplyr)
data %>%
  # rowwise will make sure the sum operation will occur on each row
  rowwise() %>%
  # then a simple sum(..., na.rm=TRUE) is enough to result in what you need
  mutate(sum = sum(a,b,c, na.rm=TRUE))
Output:
Source: local data frame [4 x 4]
Groups: <by row>

      a     b     c   sum
  (dbl) (dbl) (dbl) (dbl)
1     1     4     7    12
2     2    NA     8    10
3     3     5     9    17
4     4     6    NA    10
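For comparison, here is a sketch of two vectorised alternatives that avoid rowwise(). These are not part of the answer above; they are swapped-in techniques using base R's rowSums() and dplyr's coalesce(), and the names `result_rowsums` and `result_coalesce` are only illustrative:

```r
library(dplyr)

data <- data.frame(a = c(1, 2, 3, 4), b = c(4, NA, 5, 6), c = c(7, 8, 9, NA))

# Alternative 1: bind the columns into a matrix and let rowSums() drop the NAs
result_rowsums <- data %>%
  mutate(sum = rowSums(cbind(a, b, c), na.rm = TRUE))

# Alternative 2: replace each NA with 0 explicitly before adding
result_coalesce <- data %>%
  mutate(sum = coalesce(a, 0) + coalesce(b, 0) + coalesce(c, 0))

print(result_rowsums$sum)   # 12 10 17 10
print(result_coalesce$sum)  # 12 10 17 10
```

Both variants leave the result ungrouped, whereas rowwise() returns a row-grouped data frame that may need ungroup() before further steps.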