cbind 2 dataframes with different number of rows

Tags:

cbind

I have two lists named h and g. They each contain 244 dataframes and they look like the following:

h[[1]]
   year  avg    hr   sal
1  2010  0.300  31   2000
2  2011  0.290  30   4000
3  2012  0.275  14    600
4  2013  0.280  24    800 
5  2014  0.295  18   1000
6  2015  0.330  26   7000
7  2016  0.315  40   9000

g[[1]]
   year  pos  fld     
1  2010  A   0.990
2  2011  B   0.995
3  2013  C   0.970
4  2014  B   0.980
5  2015  D   0.990

I want to cbind these two dataframes. But as you see, they have different number of rows. I want to combine these dataframes so that the rows with the same year will be combined in one row. And I want the empty spaces to be filled with NA. The result I expect looks like this:

   year  avg    hr   sal   pos   fld
1  2010  0.300  31   2000   A   0.990
2  2011  0.290  30   4000   B   0.995
3  2012  0.275  14    600   NA    NA
4  2013  0.280  24    800   C   0.970
5  2014  0.295  18   1000   B   0.980
6  2015  0.330  26   7000   D   0.990
7  2016  0.315  40   9000   NA    NA

Also, I want to repeat this for all the 244 dataframes in each list, h and g. I'd like to make a new list named final which contains the 244 combined dataframes.

How can I do this...? All answers will be greatly appreciated :)

899

asked Nov 03 '16 10:11

min

2 Answers

I think you should instead use merge:

merge(df1, df2, by="year", all = T)

For your data:

df1 = data.frame(matrix(0, 7, 4))
names(df1) = c("year", "avg", "hr", "sal")
df1$year = 2010:2016
df1$avg = c(.3, .29, .275, .280, .295, .33, .315)
df1$hr = c(31, 30, 14, 24, 18, 26, 40)
df1$sal = c(2000, 4000, 600, 800, 1000, 7000, 9000)
df2 = data.frame(matrix(0, 5, 3))
names(df2) = c("year", "pos", "fld")
df2$year = c(2010, 2011, 2013, 2014, 2015)
df2$pos = c('A', 'B', 'C', 'B', 'D')
df2$fld = c(.99,.995,.97,.98,.99)

cbind is meant to column-bind two dataframes that are in all sense compatible. But what you aim to do is actual merge, where you want the elements from the two data frames not be discarded, and for missing values you get NA instead.

191

answered Oct 20 '22 07:10

A.Yazdiha

We can use Map with cbind.fill (from rowr) to cbind the corresponding 'data.frame' from 'h' and 'g'.

library(rowr)
Map(cbind.fill, h, g, MoreArgs = list(fill=NA))

Update

Based on the expected output showed, it seems like the OP wanted a merge instead of cbind

f1 <- function(...) merge(..., all = TRUE, by = 'year')
Map(f1, h, g)
#[[1]]
#  year   avg hr  sal  pos   fld
#1 2010 0.300 31 2000    A 0.990
#2 2011 0.290 30 4000    B 0.995
#3 2012 0.275 14  600 <NA>    NA
#4 2013 0.280 24  800    C 0.970
#5 2014 0.295 18 1000    B 0.980
#6 2015 0.330 26 7000    D 0.990
#7 2016 0.315 40 9000 <NA>    NA

Or as @Colonel Beauvel mentioned, this can be made compact

Map(merge, h, g, by='year', all=TRUE)

data

h <- list(structure(list(year = 2010:2016, avg = c(0.3, 0.29, 0.275, 
0.28, 0.295, 0.33, 0.315), hr = c(31L, 30L, 14L, 24L, 18L, 26L, 
 40L), sal = c(2000L, 4000L, 600L, 800L, 1000L, 7000L, 9000L)), .Names = c("year", 
 "avg", "hr", "sal"), class = "data.frame", row.names = c("1", 
 "2", "3", "4", "5", "6", "7")))

g <- list(structure(list(year = c(2010L, 2011L, 2013L, 2014L, 2015L
), pos = c("A", "B", "C", "B", "D"), fld = c(0.99, 0.995, 0.97, 
0.98, 0.99)), .Names = c("year", "pos", "fld"), class = "data.frame",
row.names = c("1", 
"2", "3", "4", "5")))

answered Oct 20 '22 07:10

akrun

Related questions
                            
                                Count items greater than a value in pandas groupby
                            
                                What is the best way to listen for component resize events within an Angular2 component?
                            
                                Avoid a "Google Maps" map embedded into a web page to store cookies
                            
                                How can I color part of Slack incoming-webhook messages?
                            
                                How do I debug .NET 4.6 framework source code in Visual Studio 2017?
                            
                                Can a HTML telephone link accept spaces in the value?
                            
                                Numpy dot too clever about symmetric multiplications
                            
                                Why is Arrays.binarySearch not improving the performance compared to walking the array?
                            
                                How to log - the 12 factor application way
                            
                                malformed query, expected END_OBJECT but found FIELD_NAME error in Kibana (Elastic Search)
                            
                                What is the proper way to weight decay for Adam Optimizer
                            
                                What is the difference between Component, Behaviour and MonoBehaviour? And why these are separated?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With