How best to collapse two factors with NAs into one variable

Tags: r

I have lots of sets of variables like this:

   Var1    Var2
"Asian"      NA
     NA  "Black"
"White"      NA

I would like to conveniently get them into this form:

   Race
"Asian"
"Black"
"White"

I have been trying something like:

Race <- ifelse(is.na(Var1), Var2, Var1)

But this converts the values into numbers for the levels, and the numbers don't match up (e.g., that yields 1, 1, 2). Is there a convenient way to do this (ideally with short, self-explanatory code)? (You can get out of this with as.character, but there has to be a better way.)
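To make the problem concrete, here is a minimal sketch that reproduces it, assuming Var1 and Var2 are free-standing factor vectors holding the values shown above (the question does not show how they were created, so that construction is an assumption). It also shows the as.character workaround mentioned in the question.

```r
# Assumed construction of the two factors from the question
Var1 <- factor(c("Asian", NA, "White"))
Var2 <- factor(c(NA, "Black", NA))

# ifelse() drops the factor attributes, so only the integer level codes survive.
# Each factor numbers its own levels, which is why the codes do not match up.
codes <- ifelse(is.na(Var1), Var2, Var1)
print(codes)

# Converting to character before ifelse() keeps the labels
Race <- ifelse(is.na(Var1), as.character(Var2), as.character(Var1))
print(Race)
```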

asked Jan 12 '15 01:01 by gung - Reinstate Monica



2 Answers

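Assuming this is your data (stringsAsFactors=TRUE makes the columns factors, as in the question; since R 4.0.0 data.frame() no longer does this by default):

dat <- data.frame(Var1=c("Asian",NA,"White"),Var2=c(NA,"Black",NA),stringsAsFactors=TRUE)

With an intermediate conversion via as.character: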

do.call(pmax,c(lapply(dat,as.character),na.rm=TRUE))
#[1] "Asian" "Black" "White"

If you need to work on a particular subset you can do:

do.call(pmax,c(lapply(dat[c("Var1","Var2")],as.character),na.rm=TRUE))

An alternative not requiring as.character would be:

dat[cbind(1:nrow(dat),max.col(!is.na(dat)))]
#[1] "Asian" "Black" "White"
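Since the question mentions having many such sets of variables, the as.character/pmax approach can be wrapped in a small function. This is only a sketch: collapse_cols is a hypothetical name, not part of any package, and it assumes at most one non-missing value per row (if a row has several, pmax keeps the one that sorts last).

```r
# Hypothetical helper: collapse mutually exclusive factor columns into one factor
collapse_cols <- function(data, cols) {
  chars <- lapply(data[cols], as.character)        # factor -> character labels
  merged <- do.call(pmax, c(chars, na.rm = TRUE))  # per-row non-NA value
  factor(merged)                                   # back to a single factor
}

dat <- data.frame(Var1 = c("Asian", NA, "White"),
                  Var2 = c(NA, "Black", NA),
                  stringsAsFactors = TRUE)

dat$Race <- collapse_cols(dat, c("Var1", "Var2"))
print(dat$Race)
```

Returning a factor is a design choice for this sketch; drop the factor() call to keep a character vector instead.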
answered Oct 20 '22 00:10 by thelatemail


What about this solution (with dat as the data frame, and exactly one non-missing value in each row)?

ind <- apply(dat, 1, function(x) which(!is.na(x)))
dat[cbind(seq_along(ind), ind)]
#[1] "Asian" "Black" "White"
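The matrix-indexing approach relies on every row holding exactly one non-missing value; otherwise apply() returns a list and the indexing step fails. A minimal sketch of a check that could be run first, assuming the example data from the question (the name n_present is illustrative only):

```r
dat <- data.frame(Var1 = c("Asian", NA, "White"),
                  Var2 = c(NA, "Black", NA),
                  stringsAsFactors = TRUE)

# Count the non-missing values in each row
n_present <- rowSums(!is.na(dat))

# Stop early if any row has zero or several values
stopifnot(all(n_present == 1))

# Column position of the non-missing value in each row
ind <- apply(dat, 1, function(x) which(!is.na(x)))

# Matrix indexing picks one (row, column) cell per row
Race <- dat[cbind(seq_along(ind), ind)]
print(Race)
```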
answered Oct 19 '22 23:10 by DatamineR