
Simplest way to get rbind to ignore column names

Tags:

r

rbind

This just came up in an answer to another question here. When you rbind two data frames, rbind matches columns by name rather than by index, which can lead to unexpected behavior:

> df<-data.frame(x=1:2,y=3:4)
> df
  x y
1 1 3
2 2 4
> rbind(df,df[,2:1])
  x y
1 1 3
2 2 4
3 1 3
4 2 4

Of course, there are workarounds. For example:

rbind(df,rename(df[,2:1],names(df)))
data.frame(rbind(as.matrix(df),as.matrix(df[,2:1])))

On edit: rename from the plyr package doesn't actually work this way (although I thought I had it working when I originally wrote this...). The way to do this by renaming is to use SimonO101's solution:

rbind(df,setNames(df[,2:1],names(df))) 

Also, maybe surprisingly,

data.frame(rbindlist(list(df,df[,2:1]))) 

works by index (and if we don't mind a data table, then it's pretty concise), so this is a difference between rbindlist and do.call(rbind).
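As a hedged sketch of the rbindlist route: rbindlist in the data.table package takes a use.names argument, and passing use.names = FALSE requests binding by position explicitly instead of relying on the default. The guard around the call is only there so the snippet still runs when data.table is not installed; the variable name by_position is just for illustration.

```r
df <- data.frame(x = 1:2, y = 3:4)

# Only run the example if data.table is available.
if (requireNamespace("data.table", quietly = TRUE)) {
  # use.names = FALSE binds columns by position; the result takes its
  # column names from the first data frame in the list.
  by_position <- as.data.frame(
    data.table::rbindlist(list(df, df[, 2:1]), use.names = FALSE)
  )
  print(by_position)
}
```

Spelling out use.names = FALSE makes the intent visible to readers who expect name matching.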

The question is, what is the most concise way to rbind two data frames where the names don't match? I know this seems trivial, but this kind of thing can end up cluttering code. And I don't want to have to write a new function called rbindByIndex. Ideally it would be something like rbind(df,df[,2:1],byIndex=T).

mrip asked Oct 10 '13 13:10



2 Answers

You might find setNames handy here...

rbind(df, setNames(rev(df), names(df)))
#  x y
#1 1 3
#2 2 4
#3 3 1
#4 4 2

I suspect your real use-case is somewhat more complex. You can of course reorder the columns in the first argument of setNames as you wish. Just use names(df) as the second argument, so that the names of the reordered columns match the original.
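For a more complex case with several data frames, the same idea can be applied to a whole list. This is a minimal sketch, assuming every data frame has the same number of columns in the intended order; the names dfs, target and combined are made up for the example.

```r
# Three data frames whose column names do not agree.
dfs <- list(
  data.frame(x = 1:2, y = 3:4),
  data.frame(a = 5:6, b = 7:8),
  data.frame(y = 9:10, x = 11:12)
)

# Take the names of the first data frame as the target names.
target <- names(dfs[[1]])

# Rename every data frame by position, then bind the rows.
combined <- do.call(rbind, lapply(dfs, setNames, target))
print(combined)
```

Because every element is renamed before rbind sees it, the name matching in rbind has nothing left to reorder.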

Simon O'Hanlon answered Sep 19 '22 13:09


This seems pretty easy:

mapply(c,df,df[,2:1])
     x y
[1,] 1 3
[2,] 2 4
[3,] 3 1
[4,] 4 2

For this simple case, though, you have to turn it back into a dataframe (because mapply simplifies it to a matrix):

as.data.frame(mapply(c,df,df[,2:1]))
  x y
1 1 3
2 2 4
3 3 1
4 4 2

Important note 1: There appears to be a downside of type coercion when your dataframe contains vectors of different types:

df<-data.frame(x=1:2,y=3:4,z=c('a','b'))
mapply(c,df,df[,c(2:1,3)])
     x y z
[1,] 1 3 2
[2,] 2 4 1
[3,] 3 1 2
[4,] 4 2 1

Important note 2: It is also terrible if you have factors, since the result below contains the factors' integer codes rather than their values:

df<-data.frame(x=factor(1:2),y=factor(3:4))
mapply(c,df[,1:2],df[,2:1])
     x y
[1,] 1 1
[2,] 2 2
[3,] 1 1
[4,] 2 2

So, as long as you have all numeric data, it's okay.
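For mixed column types, a hedged comparison may help. The sketch below sets stringsAsFactors = FALSE explicitly so that z is a character column regardless of R version, and contrasts the mapply result with the setNames approach from the other answer. The names by_position and via_mapply are only for illustration.

```r
df <- data.frame(x = 1:2, y = 3:4, z = c("a", "b"),
                 stringsAsFactors = FALSE)

# Rename by position, then rbind: each column keeps its own type.
by_position <- rbind(df, setNames(df[, c(2:1, 3)], names(df)))
print(sapply(by_position, class))

# mapply simplifies to a single matrix, so every column is coerced
# to one common type (character here, because of z).
via_mapply <- mapply(c, df, df[, c(2:1, 3)])
print(is.character(via_mapply))
```

So the mapply trick is best kept for data that is numeric throughout, as noted above.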

Thomas answered Sep 23 '22 13:09