Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Merge 2 columns in dataframe by prioritize one of them

Tags:

merge

r

Assuming that I have the following data.frame:

Value1    Value2
   'a'      <NA>
  <NA>       'b'
  <NA>       'c'
   'd'       'e'
   'f'       'g'
  <NA>      <NA>

How can I compine these columns in one by saying that the first column has the 'priority', meaning that if both columns have value chouse from the first one. SO the result should be:

Value3
    'a'
    'b'
    'c'
    'd'
    'f'
   <NA>

edit: where <NA> are not available values.

like image 798
Mpizos Dimitris Avatar asked Oct 23 '25 18:10

Mpizos Dimitris


2 Answers

Here's a simple method using max.col (I'm assuming these are real NAs)

df[cbind(1:nrow(df), max.col(!is.na(df), ties.method = "first"))]
# [1] "a" "b" "c" "d" "f" NA 

If these aren't real NAs, you could do

df[cbind(1:nrow(df), max.col(df != "<NA>", ties.method = "first"))]

Or alternatively convert them to NAs using is.na(df) <- df == "<NA>" and then try the first solution.

like image 146
David Arenburg Avatar answered Oct 26 '25 08:10

David Arenburg


We can try data.table. We convert the 'data.frame' to 'data.table' (setDT(df1)). If there is NA value in 'Value1', we assign the 'Value2' elements to 'Value1', remove the 'Value2' column and change the column name to 'Value3'.

library(data.table)
setnames(setDT(df1)[is.na(Value1), Value1:= Value2
              ][, Value2:= NULL], 'Value3')
df1
#   Value3
#1:      a
#2:      b
#3:      c
#4:      d
#5:      f
#6:     NA

Another popular way would be ifelse

with(df1, ifelse(is.na(Value1), Value2, Value1))
#[1] "a" "b" "c" "d" "f" NA 
like image 34
akrun Avatar answered Oct 26 '25 08:10

akrun