Select first non-NA value by row [duplicate]

Question

I have data like this:

df <- data.frame(id=c(1, 2, 3, 4), A=c(6, NA, NA, 4), B=c(3, 2, NA, NA), C=c(4, 3, 5, NA), D=c(4, 3, 1, 2))

   id A  B  C D
1  1  6  3  4 4
2  2 NA  2  3 3
3  3 NA NA  5 1
4  4  4 NA NA 2

For each row: If the row has non-NA values in column "A", I want that value to be entered into a new column 'E'. If it doesn't, I want to move on to column "B", and that value entered into E. And so on. Thus, the new column would be E = c(6, 2, 5, 4).

I wanted to use the ifelse function, but I am not quite sure how to do this.

r2evans · Accepted Answer

tidyverse

library(dplyr)
mutate(df, E = coalesce(A, B, C, D))
#   id  A  B  C D E
# 1  1  6  3  4 4 6
# 2  2 NA  2  3 3 2
# 3  3 NA NA  5 1 5
# 4  4  4 NA NA 2 4

coalesce is effectively "return the first non-NA in each vector". It has a SQL equivalent (or it is an equivalent of SQL's COALESCE, actually).

base R

df$E <- apply(df[,-1], 1, function(z) na.omit(z)[1])
df
#   id  A  B  C D E
# 1  1  6  3  4 4 6
# 2  2 NA  2  3 3 2
# 3  3 NA NA  5 1 5
# 4  4  4 NA NA 2 4

na.omit removes all of the NA values, and [1] makes sure we always return just the first of them. The advantage of [1] over (say) head(., 1) is that head will return NULL if there are no non-NA elements, whereas .[1] will always return at least an NA (indicating to you that it was the only option).

Select first non-NA value by row [duplicate]

Tags:

r

multiple-columns

missing-data

na

krazyglue5

1 Answers

tidyverse

base R

r2evans

Recent Activity

Donate For Us

Select first non-NA value by row [duplicate]

Tags:

r

multiple-columns

missing-data

na

krazyglue5

1 Answers

tidyverse

base R

r2evans

Related questions

Recent Activity

Donate For Us