Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Coalesce two string columns with alternating missing values to one

Tags:

r

missing-data

na

I have a data frame with two columns "a" and "b" with alternating missing values (NA)

a      b
dog    <NA>
mouse  <NA>
<NA>   cat
bird   <NA>

I want to "merge" / combine them to a new column c that looks like this, i.e. the non-NA element in each row is selected:

c
dog
mouse
cat
bird

I tried merge and join, but neither worked as I wanted. Maybe because I do not have an id with which to merge? For integers I would just circumvent this and add both columns, but how in my case?

like image 231
ben_aaron Avatar asked Jan 08 '15 22:01

ben_aaron


2 Answers

I wrote a coalesce() function for this type of task which works much like the SQL coalesce function. You would use it like

dd<-read.table(text="a      b
dog    NA
mouse  NA
NA   cat
bird   NA", header=T)

dd$c <- with(dd, coalesce(a,b))
dd
#       a    b     c
# 1   dog <NA>   dog
# 2 mouse <NA> mouse
# 3  <NA>  cat   cat
# 4  bird <NA>  bird
like image 172
MrFlick Avatar answered Nov 15 '22 15:11

MrFlick


Another option is to use which with arr.ind=TRUE

indx <- which(!is.na(df), arr.ind=TRUE)
df$c <-  df[indx][order(indx[,1])]
df
#    a    b     c
#1   dog <NA>   dog
#2 mouse <NA> mouse
#3  <NA>  cat   cat
#4  bird <NA>  bird

Or

df$c <- df[cbind(1:nrow(df),max.col(!is.na(df)))]
like image 35
akrun Avatar answered Nov 15 '22 16:11

akrun