I have 12 data frames, each with 6 columns: 5 share the same names and 1 differs. When I call rbind() on them
I get:
Error in match.names(clabs, names(xi)) : names do not match previous names
The column that differs is "goal1Completions". There are 12 such goal-completion columns, one per data frame: "goal1Completions", "goal2Completions", "goal3Completions", and so on.
The best way I can think of is: renaming every column in every data frame to "GoalsCompletions" and then using "rbind()".
Is there a simpler way?
Looking on Google I found the package "gtools". It has a function called "smartbind". However, when I try to look at the resulting data frame with View() after using smartbind(), my R session crashes...
My data (an example of the first data frame):
            date   source   medium  campaign goal1Completions ad.cost Goal
    1 2014-10-01 (direct)   (none) (not set)                0  0.0000 Vida
    2 2014-10-01   Master    email     CAFRE                0  0.0000 Vida
    3 2014-10-01   apeseg referral (not set)                0  0.0000 Vida
Method 1: Using the plyr package. The rbind.fill() function from plyr is an enhancement of base R's rbind() and is used to combine data frames with different columns. The column names and the number of columns may differ between the input data frames. Columns missing from a given data frame are filled with NA.
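A minimal sketch of that behaviour, assuming the plyr package is installed. The toy data frames df1 and df2 are made up for illustration. Note that the differently named columns end up as separate columns padded with NA, not stacked into a single column, so this may not be what the question is after.

```r
library(plyr)

# Hypothetical toy data: one shared column, one column whose name differs
df1 <- data.frame(date = c("2014-10-01", "2014-10-02"), goal1Completions = c(0, 1))
df2 <- data.frame(date = c("2014-10-03", "2014-10-04"), goal2Completions = c(2, 3))

# rbind.fill() keeps the union of all columns and pads the gaps with NA
filled <- rbind.fill(df1, df2)
print(filled)
```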
Let's find out. In the following example, we will change the column name from 'lastName' to 'surName' for the second data frame. The above code throws an error that the column names must match. So, the column names in both the data frames must be the same if you want to use rbind().
To join two data frames (datasets) vertically, use the rbind function. The two data frames must have the same variables, but they do not have to be in the same order. If data frameA has variables that data frameB does not, then either: Delete the extra variables in data frameA or.
My favourite use of mapply:
Example Data
    a <- data.frame(a=runif(5), b=runif(5))
    > a
              a         b
    1 0.8403348 0.1579255
    2 0.4759767 0.8182902
    3 0.8091875 0.1080651
    4 0.9846333 0.7035959
    5 0.2153991 0.8744136
and b
    b <- data.frame(c=runif(5), d=runif(5))
    > b
              c         d
    1 0.7604137 0.9753853
    2 0.7553924 0.1210260
    3 0.7315970 0.6196829
    4 0.5619395 0.1120331
    5 0.5711995 0.7252631
Solution
Using mapply:
    > mapply(c, a,b) #or as.data.frame(mapply(c, a,b)) for a data.frame
                  a         b
     [1,] 0.8403348 0.1579255
     [2,] 0.4759767 0.8182902
     [3,] 0.8091875 0.1080651
     [4,] 0.9846333 0.7035959
     [5,] 0.2153991 0.8744136
     [6,] 0.7604137 0.9753853
     [7,] 0.7553924 0.1210260
     [8,] 0.7315970 0.6196829
     [9,] 0.5619395 0.1120331
    [10,] 0.5711995 0.7252631
And based on @Marat's comment below:
You can also do data.frame(mapply(c, a, b, SIMPLIFY=FALSE)) or, alternatively, data.frame(Map(c,a,b)) to avoid the double data.frame-matrix conversion.
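The same idea can be extended to more than two data frames. The following is only a sketch in base R, with made-up toy data (df1, df2, df3) standing in for the 12 data frames. It relies on the columns being in the same order in every data frame, because Map() pairs them by position and the result takes its names from the first data frame.

```r
# Hypothetical toy data standing in for the 12 data frames
df1 <- data.frame(date = as.Date("2014-10-01") + 0:1, goal1Completions = c(0, 1))
df2 <- data.frame(date = as.Date("2014-10-03") + 0:1, goal2Completions = c(2, 3))
df3 <- data.frame(date = as.Date("2014-10-05") + 0:1, goal3Completions = c(4, 5))
dfs <- list(df1, df2, df3)

# Equivalent to Map(c, df1, df2, df3): columns are concatenated by position.
# unname() keeps any list names from being passed on to c() as argument names.
stacked <- data.frame(do.call(Map, c(f = c, unname(dfs))))
print(stacked)
```

If the column order differs between data frames, this silently stacks the wrong columns together, so it is worth checking the order first.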
You could use rbindlist, which accepts data frames with different column names (it binds columns by position). Using @LyzandeR's data:
    library(data.table) #data.table_1.9.5
    rbindlist(list(a,b))
    #            a         b
    #  1: 0.8403348 0.1579255
    #  2: 0.4759767 0.8182902
    #  3: 0.8091875 0.1080651
    #  4: 0.9846333 0.7035959
    #  5: 0.2153991 0.8744136
    #  6: 0.7604137 0.9753853
    #  7: 0.7553924 0.1210260
    #  8: 0.7315970 0.6196829
    #  9: 0.5619395 0.1120331
    # 10: 0.5711995 0.7252631
Based on the object names of the 12 datasets (i.e. 'Goal1_Costo', 'Goal2_Costo',..., 'Goal12_Costo'),
    nm1 <- paste(paste0('Goal', 1:12), 'Costo', sep="_")
    #or using `sprintf`
    #nm1 <- sprintf('%s%d_%s', 'Goal', 1:12, 'Costo')
    rbindlist(mget(nm1))
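A small self-contained variant of the above, as a sketch only. The two toy objects stand in for the real 'Goal1_Costo' and 'Goal2_Costo', and the final column name "goalCompletions" is a hypothetical choice. The answer was written against data.table 1.9.5; as far as I know, newer versions may emit a message when column names differ under the default use.names setting, so passing use.names = FALSE makes the by-position binding explicit.

```r
library(data.table)

# Hypothetical stand-ins for Goal1_Costo and Goal2_Costo
Goal1_Costo <- data.frame(date = c("2014-10-01", "2014-10-02"), goal1Completions = c(0, 1))
Goal2_Costo <- data.frame(date = c("2014-10-03", "2014-10-04"), goal2Completions = c(2, 3))

nm1 <- paste(paste0("Goal", 1:2), "Costo", sep = "_")

# use.names = FALSE binds by column position; idcol records the source object
res <- rbindlist(mget(nm1), use.names = FALSE, idcol = "dataset")

# The result takes its names from the first item, so rename the stacked column
setnames(res, "goal1Completions", "goalCompletions")
print(res)
```

The idcol column keeps track of which goal each row came from, which would otherwise be lost once the differing column names are collapsed into one.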