I have data like this:
df <- data.frame(id=c(1, 2, 3, 4), A=c(6, NA, NA, 4), B=c(3, 2, NA, NA), C=c(4, 3, 5, NA), D=c(4, 3, 1, 2))
id A B C D
1 1 6 3 4 4
2 2 NA 2 3 3
3 3 NA NA 5 1
4 4 4 NA NA 2
For each row: If the row has non-NA values in column "A", I want that value to be entered into a new column 'E'. If it doesn't, I want to move on to column "B", and that value entered into E. And so on. Thus, the new column would be E = c(6, 2, 5, 4).
I wanted to use the ifelse function, but I am not quite sure how to do this.
library(dplyr)
mutate(df, E = coalesce(A, B, C, D))
# id A B C D E
# 1 1 6 3 4 4 6
# 2 2 NA 2 3 3 2
# 3 3 NA NA 5 1 5
# 4 4 4 NA NA 2 4
coalesce is effectively "return the first non-NA in each vector". It has a SQL equivalent (or it is an equivalent of SQL's COALESCE, actually).
df$E <- apply(df[,-1], 1, function(z) na.omit(z)[1])
df
# id A B C D E
# 1 1 6 3 4 4 6
# 2 2 NA 2 3 3 2
# 3 3 NA NA 5 1 5
# 4 4 4 NA NA 2 4
na.omit removes all of the NA values, and [1] makes sure we always return just the first of them. The advantage of [1] over (say) head(., 1) is that head will return NULL if there are no non-NA elements, whereas .[1] will always return at least an NA (indicating to you that it was the only option).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With