Sum of two Columns of Data Frame with NA Values

Tags:

r

I have a data frame with some NA values. I need the sum of two of the columns. If a value is NA, I need to treat it as zero.

a  b c d
1  2 3 4
5 NA 7 8

Column e should be the sum of b and c:

e
5
7

I have tried a lot of things, and done two dozen searches with no luck. It seems like a simple problem. Any help would be appreciated!

StatDance asked Jul 16 '15 17:07



5 Answers

dat$e <- rowSums(dat[,c("b", "c")], na.rm=TRUE)
dat
#   a  b c d e
# 1 1  2 3 4 5
# 2 5 NA 7 8 7
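
For anyone reproducing this, here is a minimal sketch that rebuilds the example data from the question and contrasts plain addition with rowSums(). The variable name plain is only for illustration and is not part of the original answer.

```r
# Rebuild the example data from the question (values copied from the post).
dat <- data.frame(a = c(1, 5), b = c(2, NA), c = c(3, 7), d = c(4, 8))

# Plain addition propagates NA: the second row comes out as NA.
plain <- dat$b + dat$c

# rowSums() with na.rm = TRUE drops the NA before summing each row.
dat$e <- rowSums(dat[, c("b", "c")], na.rm = TRUE)
dat
```

Note that with na.rm = TRUE a row in which both b and c are NA sums to 0, which matches the "treat NA as zero" requirement in the question.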
Rorschach answered Sep 28 '22 13:09


dplyr solution, taken from here:

library(dplyr)
dat %>% 
    rowwise() %>% 
    mutate(e = sum(b, c, na.rm = TRUE))
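
A possible variation, sketched here under the assumption that dplyr is installed: coalesce() swaps each NA for a fallback value, so the sum can stay vectorised without rowwise(). The name res is hypothetical and the data is rebuilt from the question.

```r
library(dplyr)

# Example data copied from the question.
dat <- data.frame(a = c(1, 5), b = c(2, NA), c = c(3, 7), d = c(4, 8))

# coalesce() replaces each NA with the fallback value (0 here),
# so the addition works on whole columns at once.
res <- dat %>%
  mutate(e = coalesce(b, 0) + coalesce(c, 0))
res
```

On large data frames this is likely to be faster than a rowwise() pipeline, since nothing is evaluated row by row.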
David Rubinger answered Sep 27 '22 13:09


Here is another solution, with concatenated ifelse():

 dat$e <- ifelse(is.na(dat$b) & is.na(dat$c), 0,
          ifelse(is.na(dat$b), dat$c,
          ifelse(is.na(dat$c), dat$b, dat$b + dat$c)))
 #  a  b c d e
 #1 1  2 3 4 5
 #2 5 NA 7 8 7

Edit: here is another solution that uses with(), as suggested by @kasterma in the comments. It is much more readable and straightforward:

 dat$e <- with(dat, ifelse(is.na(b) & is.na(c), 0,
               ifelse(is.na(b), c, ifelse(is.na(c), b, b + c))))
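
The nested ifelse() calls above spell out which operand is missing. The same idea can be sketched by replacing NA with zero first and then adding. The helper na_to_zero is a hypothetical name, not something from the original answer, and the third row is added only to cover the case where c is the missing value.

```r
# Hypothetical helper (not from the original answers): swap NA for 0.
na_to_zero <- function(x) ifelse(is.na(x), 0, x)

# Question data plus one extra row where c is NA instead of b.
dat <- data.frame(a = c(1, 5, 9), b = c(2, NA, 6), c = c(3, 7, NA), d = c(4, 8, 12))

# Each column is cleaned independently, so either (or both) may be NA.
dat$e <- na_to_zero(dat$b) + na_to_zero(dat$c)
dat
```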
erasmortg answered Sep 28 '22 13:09


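I hope this helps.

In some cases the data frame also contains columns that are not numeric. This approach handles that case as well, because c_across(a:d) sums only the selected columns. Here e is the sum of all four numeric columns, a through d. Note that c_across() requires dplyr 1.0.0 or later.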

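library(dplyr)

df <- data.frame(
  TEXT = c("text1", "text2"), a = c(1,5), b = c(2, NA), c = c(3,7), d = c(4,8))

# rowwise() makes sum() run once per row; c_across(a:d) collects that row's values
df2 <- df %>% 
  rowwise() %>% 
  mutate(e = sum(c_across(a:d), na.rm = TRUE))
df2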
# A tibble: 2 x 6
# Rowwise: 
# TEXT        a     b     c     d     e
# <chr>     <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 text1     1     2     3     4    10
# 2 text2     5    NA     7     8    20
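
As a rough alternative sketch, assuming dplyr 1.0.0 or later: the numeric columns can be picked by type rather than by position, and rowSums() can do the summing without rowwise(). The name df3 is hypothetical.

```r
library(dplyr)

df <- data.frame(
  TEXT = c("text1", "text2"), a = c(1, 5), b = c(2, NA), c = c(3, 7), d = c(4, 8))

# across(where(is.numeric)) selects only the numeric columns;
# rowSums() then adds them row by row, skipping NA values.
df3 <- df %>%
  mutate(e = rowSums(across(where(is.numeric)), na.rm = TRUE))
df3
```

Selecting by type means the TEXT column is skipped wherever it sits in the data frame.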
Tho Vu answered Sep 25 '22 13:09


If you want to keep NA when both columns are NA, you can use the following.

Sample data:

library(data.table)
dt <- data.table(x = sample(c(NA, 1, 2, 3), 100, replace = TRUE), y = sample(c(NA, 1, 2, 3), 100, replace = TRUE))

Solution:

dt[, z := ifelse(is.na(x) & is.na(y), NA_real_, rowSums(.SD, na.rm = TRUE)), .SDcols = c("x", "y")]

(the data.table way)
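
The same "keep NA only when both values are missing" rule can be sketched in base R on a small fixed input, which makes the behaviour easier to check by eye than with random data. The vectors x, y and z here are illustrative values, not taken from the answer.

```r
# Four cases: both present, x missing, both missing, y missing.
x <- c(1, NA, NA, 3)
y <- c(2, 5, NA, NA)

# rowSums(..., na.rm = TRUE) alone would give 0 for the third case,
# so ifelse() puts NA back where both inputs are NA.
z <- ifelse(is.na(x) & is.na(y), NA_real_, rowSums(cbind(x, y), na.rm = TRUE))
z
```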

K. Peltzer answered Sep 25 '22 13:09