I have two big lists of dataframes that I want to merge. Here is a sample of the data.
list1 = list(data.frame(Wvlgth = c(337, 337.5, 338, 338.5, 339, 339.5),
Global = c(".9923+00",".01245+00", ".0005+00", ".33421E+00", ".74361+00", ".129342+00"),
group = c(0,0,0,0,0,0)),
data.frame(Wvlgth = c(337, 337.5, 338, 338.5, 339, 339.5),
Global = c(".1284+00",".0098+00", ".7853+00", ".2311+00", ".1211+00", ".75345+00"),
group = c(1,1,1,1,1,1)))
list2 = list(data.frame(Wvlgth = c(337, 337.5, 338, 339),
time = c("13.445","13.445", "13.445", "13.445"),
IRD = c(.01324, .34565, .92395, .67489)),
data.frame(Wvlgth = c(337, 337.5, 338, 339),
time = c("13.45361","13.45361", "13.45361", "13.45361"),
IRD = c(.20981, .98703, .54092, .38567)))
I want to merge each dataframe of list1 with each dataframe of list2, by "Wvlgth", to get something like this:
Wvlgth time IRD Global group
337 13.445 0.01324 .9923+00 0
337.5 13.445 0.34565 .01245+00 0
338 13.445 0.92395 .0005+00 0
339 13.445 0.67489 .74361+00 0
337 13.45361 0.20981 .1284+00 1
337.5 13.45361 0.98703 .0098+00 1
338 13.45361 0.54092 .7853+00 1
339 13.45361 0.38567 .1211+00 1
I want to use an inner join because the dataframes of list1 don't have the same number of rows as the dataframes of list2.
I tried the accepted answer using dplyr from this question, but it merged them in a weird way and I'm not quite sure what happened. It looks like it merged them horizontally instead of vertically:
> c(list1, list2) %>%
Reduce(function(dtf1, dtf2) inner_join(dtf1, dtf2, by="Wvlgth"), .)
Wvlgth Global.x group.x Global.y group.y time.x IRD.x time.y
1 337.0 .9923+00 0 .1284+00 1 13.445 0.01324 13.45361
2 337.5 .01245+00 0 .0098+00 1 13.445 0.34565 13.45361
3 338.0 .0005+00 0 .7853+00 1 13.445 0.92395 13.45361
4 339.0 .74361+00 0 .1211+00 1 13.445 0.67489 13.45361
IRD.y
1 0.20981
2 0.98703
3 0.54092
4 0.38567
You could loop through both lists simultaneously and join each pair of elements using map2 from package purrr. To return a single data.frame rather than a list of separate, joined data.frames, you can use map2_df.
library(purrr)
library(dplyr)
map2_df(list1, list2, inner_join, by = "Wvlgth")
Wvlgth Global group time IRD
1 337.0 .9923+00 0 13.445 0.01324
2 337.5 .01245+00 0 13.445 0.34565
3 338.0 .0005+00 0 13.445 0.92395
4 339.0 .74361+00 0 13.445 0.67489
5 337.0 .1284+00 1 13.45361 0.20981
6 337.5 .0098+00 1 13.45361 0.98703
7 338.0 .7853+00 1 13.45361 0.54092
8 339.0 .1211+00 1 13.45361 0.38567
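A minimal sketch of the same pairwise join with the two steps written out separately, in case that makes it easier to follow. It assumes purrr 1.0.0 or later, where list_rbind() is available and map2_df() is marked as superseded. The data is a trimmed copy of the question's lists (the Global and time columns are left out to keep it short). Unlike the Reduce() call in the question, which folds all four data frames into one chain of joins and so keeps adding columns, map2() only joins list1[[i]] with list2[[i]].

```r
library(purrr)
library(dplyr)

# Trimmed copy of the question's data (Global and time columns omitted)
list1 <- list(
  data.frame(Wvlgth = c(337, 337.5, 338, 338.5, 339, 339.5), group = 0),
  data.frame(Wvlgth = c(337, 337.5, 338, 338.5, 339, 339.5), group = 1)
)
list2 <- list(
  data.frame(Wvlgth = c(337, 337.5, 338, 339),
             IRD = c(.01324, .34565, .92395, .67489)),
  data.frame(Wvlgth = c(337, 337.5, 338, 339),
             IRD = c(.20981, .98703, .54092, .38567))
)

# Step 1: join the i-th element of list1 with the i-th element of list2.
# The result is still a list, with one joined data frame per pair.
joined <- map2(list1, list2, function(x, y) inner_join(x, y, by = "Wvlgth"))

# Step 2: stack the joined data frames vertically into one data frame.
result <- list_rbind(joined)
```

With this trimmed data, result should have eight rows (four per pair) and the columns Wvlgth, group and IRD.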
In base R, you can feed the output of Map to do.call / rbind.
do.call(rbind, Map(merge, list1, list2, by="Wvlgth"))
Wvlgth Global group time IRD
1 337.0 .9923+00 0 13.445 0.01324
2 337.5 .01245+00 0 13.445 0.34565
3 338.0 .0005+00 0 13.445 0.92395
4 339.0 .74361+00 0 13.445 0.67489
5 337.0 .1284+00 1 13.45361 0.20981
6 337.5 .0098+00 1 13.45361 0.98703
7 338.0 .7853+00 1 13.45361 0.54092
8 339.0 .1211+00 1 13.45361 0.38567
Map merges the corresponding data.frames in the two lists and returns a single list of data.frames. These data.frames are then appended with do.call and rbind.
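To make the two steps visible, here is a small sketch that keeps the intermediate list around before stacking it. It uses a trimmed copy of the question's data (Global and time columns omitted), and the names merged and stacked are only for illustration. merge() defaults to all = FALSE, which is what makes this an inner join.

```r
# Trimmed copy of the question's data (Global and time columns omitted)
list1 <- list(
  data.frame(Wvlgth = c(337, 337.5, 338, 338.5, 339, 339.5), group = 0),
  data.frame(Wvlgth = c(337, 337.5, 338, 338.5, 339, 339.5), group = 1)
)
list2 <- list(
  data.frame(Wvlgth = c(337, 337.5, 338, 339),
             IRD = c(.01324, .34565, .92395, .67489)),
  data.frame(Wvlgth = c(337, 337.5, 338, 339),
             IRD = c(.20981, .98703, .54092, .38567))
)

# Step 1: Map() calls merge(list1[[i]], list2[[i]], by = "Wvlgth") for each i
# and returns a list with one merged data.frame per pair.
merged <- Map(merge, list1, list2, by = "Wvlgth")

# Rows kept per pair: only wavelengths present in both data.frames survive.
rows_per_pair <- vapply(merged, nrow, integer(1))

# Step 2: do.call() passes every list element to rbind() as a separate argument.
stacked <- do.call(rbind, merged)
```

Inspecting rows_per_pair before stacking is a quick way to confirm that each pair lost only the wavelengths missing from list2.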
If the data sets are especially large, you can perform the appending with rbindlist from data.table:
library(data.table)
rbindlist(Map(merge, list1, list2, by="Wvlgth"))
which returns a data.table object.
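If a plain data.frame is needed downstream, the result can be converted back. A minimal sketch, assuming data.table is installed and using a trimmed copy of the question's data (Global and time columns omitted); the names dt and df are only for illustration.

```r
library(data.table)

# Trimmed copy of the question's data (Global and time columns omitted)
list1 <- list(
  data.frame(Wvlgth = c(337, 337.5, 338, 338.5, 339, 339.5), group = 0),
  data.frame(Wvlgth = c(337, 337.5, 338, 338.5, 339, 339.5), group = 1)
)
list2 <- list(
  data.frame(Wvlgth = c(337, 337.5, 338, 339),
             IRD = c(.01324, .34565, .92395, .67489)),
  data.frame(Wvlgth = c(337, 337.5, 338, 339),
             IRD = c(.20981, .98703, .54092, .38567))
)

# Merge pairwise with base R, then stack with rbindlist(); dt is a data.table.
dt <- rbindlist(Map(merge, list1, list2, by = "Wvlgth"))

# as.data.frame() returns a copy as an ordinary data.frame.
df <- as.data.frame(dt)
```

setDF(dt) performs the same conversion in place, without making a copy, which may matter for very large results.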