
Merge two lists of dataframes

I have two big lists of dataframes that I want to merge. Here is a sample of the data.

list1 = list(data.frame(Wvlgth = c(337, 337.5, 338, 338.5, 339, 339.5),
            Global = c(".9923+00",".01245+00", ".0005+00", ".33421E+00", ".74361+00", ".129342+00"),
            group = c(0,0,0,0,0,0)),
            data.frame(Wvlgth = c(337, 337.5, 338, 338.5, 339, 339.5),
            Global = c(".1284+00",".0098+00", ".7853+00", ".2311+00", ".1211+00", ".75345+00"),
            group = c(1,1,1,1,1,1)))

list2 = list(data.frame(Wvlgth = c(337, 337.5, 338, 339),
                time = c("13.445","13.445", "13.445", "13.445"),
                IRD = c(.01324, .34565, .92395, .67489)),
                data.frame(Wvlgth = c(337, 337.5, 338, 339),
                time = c("13.45361","13.45361", "13.45361", "13.45361"),
                IRD = c(.20981, .98703, .54092, .38567)))

I want to merge each dataframe of list1 with the corresponding dataframe of list2 (first with first, second with second), by "Wvlgth", and stack the results to get something like this:

Wvlgth    time      IRD        Global      group
337       13.445    0.01324    .9923+00        0
337.5     13.445    0.34565    .01245+00       0
338       13.445    0.92395    .0005+00        0
339       13.445    0.67489    .74361+00       0
337       13.45361  0.20981    .1284+00        1
337.5     13.45361  0.98703    .0098+00        1
338       13.45361  0.54092    .7853+00        1
339       13.45361  0.38567    .1211+00        1

I want to use an inner join because the dataframes of list1 don't have the same number of rows as the dataframes of list2.

I tried the accepted answer using dplyr from this question, but it merged them in a weird way and I'm not quite sure what happened. It looks like it joined everything side by side into one wide data frame instead of stacking the merged pairs vertically:

> c(list1, list2) %>%
      Reduce(function(dtf1, dtf2) inner_join(dtf1, dtf2, by="Wvlgth"), .)

  Wvlgth  Global.x group.x Global.y group.y time.x   IRD.x   time.y
1  337.0  .9923+00       0 .1284+00       1 13.445 0.01324 13.45361
2  337.5 .01245+00       0 .0098+00       1 13.445 0.34565 13.45361
3  338.0  .0005+00       0 .7853+00       1 13.445 0.92395 13.45361
4  339.0 .74361+00       0 .1211+00       1 13.445 0.67489 13.45361
    IRD.y
1 0.20981
2 0.98703
3 0.54092
4 0.38567
ale19 asked Jun 22 '17 15:06


2 Answers

You could loop through both lists simultaneously and join each element using map2 from package purrr. To return a single data.frame rather than a list of separate, joined data.frames you can use map2_df.

library(purrr)
library(dplyr)

map2_df(list1, list2, inner_join, by = "Wvlgth")

  Wvlgth    Global group     time     IRD
1  337.0  .9923+00     0   13.445 0.01324
2  337.5 .01245+00     0   13.445 0.34565
3  338.0  .0005+00     0   13.445 0.92395
4  339.0 .74361+00     0   13.445 0.67489
5  337.0  .1284+00     1 13.45361 0.20981
6  337.5  .0098+00     1 13.45361 0.98703
7  338.0  .7853+00     1 13.45361 0.54092
8  339.0  .1211+00     1 13.45361 0.38567
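
The same idea can be split into two explicit steps, which may be useful because recent purrr releases mark the map2_df family as superseded. The following is a minimal sketch on a cut-down version of the question's data (the smaller data frames and the "pair" column name are assumptions made for illustration); it joins with map2 and then stacks with dplyr's bind_rows:

```r
library(purrr)
library(dplyr)

# Cut-down, illustrative version of the question's data
list1 <- list(
  data.frame(Wvlgth = c(337, 337.5, 338), group = c(0, 0, 0)),
  data.frame(Wvlgth = c(337, 337.5, 338), group = c(1, 1, 1))
)
list2 <- list(
  data.frame(Wvlgth = c(337, 338), IRD = c(0.01324, 0.92395)),
  data.frame(Wvlgth = c(337, 338), IRD = c(0.20981, 0.54092))
)

# Step 1: join the i-th element of list1 with the i-th element of list2.
# The result is a list with one joined data frame per pair.
joined <- map2(list1, list2, inner_join, by = "Wvlgth")

# Step 2: stack the joined pieces vertically.
# .id adds a column recording which pair each row came from.
result <- bind_rows(joined, .id = "pair")
```

Keeping the intermediate list around makes it easy to inspect a single pair (for example joined[[2]]) before everything is stacked.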
aosmith answered Oct 16 '22 15:10



In base R, you can feed the output of Map to do.call / rbind.

do.call(rbind, Map(merge, list1, list2, by="Wvlgth"))
  Wvlgth    Global group     time     IRD
1  337.0  .9923+00     0   13.445 0.01324
2  337.5 .01245+00     0   13.445 0.34565
3  338.0  .0005+00     0   13.445 0.92395
4  339.0 .74361+00     0   13.445 0.67489
5  337.0  .1284+00     1 13.45361 0.20981
6  337.5  .0098+00     1 13.45361 0.98703
7  338.0  .7853+00     1 13.45361 0.54092
8  339.0  .1211+00     1 13.45361 0.38567

Map merges the corresponding data.frames in the two lists and returns a single list of data.frames. These data.frames are then appended with do.call and rbind.
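
To make the two stages visible, here is a small sketch that keeps the intermediate list. It uses a cut-down version of the question's data (an assumption for brevity, not the original data frames). Note that merge defaults to all = FALSE, which is an inner join, so rows without a matching Wvlgth are dropped:

```r
# Cut-down, illustrative version of the question's data
list1 <- list(
  data.frame(Wvlgth = c(337, 337.5, 338), group = c(0, 0, 0)),
  data.frame(Wvlgth = c(337, 337.5, 338), group = c(1, 1, 1))
)
list2 <- list(
  data.frame(Wvlgth = c(337, 338), IRD = c(0.01324, 0.92395)),
  data.frame(Wvlgth = c(337, 338), IRD = c(0.20981, 0.54092))
)

# Stage 1: Map() calls merge(list1[[i]], list2[[i]], by = "Wvlgth") for each i.
# merge() keeps only matching Wvlgth values by default (inner join).
merged <- Map(merge, list1, list2, by = "Wvlgth")

# Stage 2: do.call() passes the list elements to rbind() as separate
# arguments, which stacks them into a single data.frame.
result <- do.call(rbind, merged)
```

Here merged is a list of two data.frames with two rows each, and result has four rows.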

If the data sets are especially large, you can perform the appending with rbindlist from data.table:

library(data.table)
rbindlist(Map(merge, list1, list2, by="Wvlgth"))

which returns a data.table object.
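
If it helps to know which pair each row came from, rbindlist can add an identifier column through its idcol argument. A minimal sketch, again on a cut-down version of the question's data (the smaller data frames and the column name "pair" are assumptions for illustration):

```r
library(data.table)

# Cut-down, illustrative version of the question's data
list1 <- list(
  data.frame(Wvlgth = c(337, 337.5, 338), group = c(0, 0, 0)),
  data.frame(Wvlgth = c(337, 337.5, 338), group = c(1, 1, 1))
)
list2 <- list(
  data.frame(Wvlgth = c(337, 338), IRD = c(0.01324, 0.92395)),
  data.frame(Wvlgth = c(337, 338), IRD = c(0.20981, 0.54092))
)

# Merge pair by pair, then stack; idcol records the list position of each pair
merged <- Map(merge, list1, list2, by = "Wvlgth")
result <- rbindlist(merged, idcol = "pair")

# Convert back to a plain data.frame if a data.table is not wanted
result_df <- as.data.frame(result)
```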

lmo answered Oct 16 '22 16:10
