In R, how can I inner_join multiple tbls or data.frames effectively?
For example:
devtools::install_github("rstudio/EDAWR")
library(EDAWR)
library(dplyr)
data(songs)
data(artists)
test <- songs
colnames(test) <- c("song2", "name")
inner_join(songs, artists,by="name") %>% inner_join(test,by="name")
There are hundreds of test-like data.frames that I want to join.
Joins with dplyr: dplyr names its join functions after SQL joins. A left join keeps every row of the left data frame (the x data frame in merge()) and adds the columns of all matching rows from the right (y) data frame. If the join columns have the same name in both data frames, all you need is left_join(x, y).
inner_join(x, y): Return all rows from x where there are matching values in y, and all columns from x and y. If there are multiple matches between x and y, all combinations of the matches are returned. This is a mutating join.
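The multiple-match behaviour described above can be sketched with a small example. The data frames x and y here are hypothetical stand-ins, not the EDAWR data, and the sketch assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical data: "John" appears once in x and twice in y,
# "Buddy" appears only in x.
x <- data.frame(name  = c("John", "Paul", "Buddy"),
                plays = c("guitar", "bass", "guitar"))
y <- data.frame(name = c("John", "John", "Paul"),
                song = c("Across the Universe", "Come Together",
                         "Hello, Goodbye"))

# Only names present in both inputs survive, and each match in y
# produces its own output row.
joined <- inner_join(x, y, by = "name")
print(joined)
```

Here "John" should come back twice (once per matching song) and "Buddy" should be dropped, because he has no match in y.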
How to join multiple data frames in R: it is often useful to combine several data frames, and the left_join() function from the dplyr package makes this simple. To combine three data frames, run two left joins, one after the other.
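As a minimal sketch of two chained left joins, assuming three hypothetical data frames (df1, df2, df3) that share a name column:

```r
library(dplyr)

# Hypothetical data frames sharing the key column "name".
df1 <- data.frame(name  = c("John", "Paul", "George"),
                  plays = c("guitar", "bass", "sitar"))
df2 <- data.frame(name = c("John", "Paul"),
                  song = c("Come Together", "Hello, Goodbye"))
df3 <- data.frame(name = c("John", "George"),
                  born = c(1940, 1943))

# Every row of df1 is kept; unmatched columns are filled with NA.
combined <- df1 %>%
  left_join(df2, by = "name") %>%
  left_join(df3, by = "name")
print(combined)
```

Chaining like this works for a handful of data frames but does not scale to hundreds, which is the case the question asks about.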
You could collect the data frames in a list and use Reduce:
L <- list(songs, artists, test)
Reduce(inner_join, L)
# name plays song song2
# 1 John guitar Across the Universe Across the Universe
# 2 John guitar Come Together Across the Universe
# 3 John guitar Across the Universe Come Together
# 4 John guitar Come Together Come Together
# 5 Paul bass Hello, Goodbye Hello, Goodbye
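If you would rather state the join column explicitly than rely on inner_join guessing the common columns, you can wrap the call in a small function. This is a sketch on hypothetical data (the list dfs is made up for illustration), assuming every data frame has a name column:

```r
library(dplyr)

# Hypothetical list of data frames, each keyed by "name".
dfs <- list(
  data.frame(name = c("John", "Paul", "Ringo"),
             plays = c("guitar", "bass", "drums")),
  data.frame(name = c("John", "Paul"),
             song = c("Come Together", "Hello, Goodbye")),
  data.frame(name = c("John", "Paul", "George"),
             born = c(1940, 1942, 1943))
)

# Reduce folds the list left to right: ((dfs[[1]] join dfs[[2]]) join dfs[[3]]).
joined <- Reduce(function(x, y) inner_join(x, y, by = "name"), dfs)
print(joined)
```

Passing by explicitly also silences the "Joining with by" message that dplyr prints when it picks the columns itself.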
You can use L <- mget(ls()) (with an optional pattern arg to ls) to get everything into a list.
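A minimal sketch of the mget/ls approach with a pattern follows. The object names (test1, test2, other) and the separate environment env are assumptions made so the example is self-contained; in an interactive session you would typically call ls() and mget() on the global environment directly.

```r
# Hypothetical objects stored in their own environment.
env <- new.env()
assign("test1", data.frame(name = c("John", "Paul"), a = 1:2), envir = env)
assign("test2", data.frame(name = c("John", "Paul"), b = 3:4), envir = env)
assign("other", 42, envir = env)  # not a data frame, should be skipped

# Collect only objects whose names look like test1, test2, ...
L <- mget(ls(envir = env, pattern = "^test[0-9]+$"), envir = env)

# Guard against non-data-frame objects that happen to match the pattern.
L <- Filter(is.data.frame, L)
print(names(L))
```

The resulting named list can then be passed straight to Reduce.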
As @akrun mentioned in the comments, a plyr alternative is:
library(plyr)
join_all(L, type='inner')
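A sketch of the same call on hypothetical data, assuming plyr is installed. It calls join_all through the plyr:: prefix instead of attaching the package, since loading plyr after dplyr masks several dplyr functions such as summarise and mutate.

```r
# Hypothetical list of data frames, each keyed by "name".
dfs <- list(
  data.frame(name = c("John", "Paul", "Ringo"),
             plays = c("guitar", "bass", "drums")),
  data.frame(name = c("John", "Paul"),
             song = c("Come Together", "Hello, Goodbye")),
  data.frame(name = c("John", "Paul", "George"),
             born = c(1940, 1942, 1943))
)

# join_all joins the list elements one after another on the given key.
joined <- plyr::join_all(dfs, by = "name", type = "inner")
print(joined)
```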