dplyr (version 0.4.1) prints the colnames by which it is performing the join. Is it possible to turn this option off?
R code:
library(dplyr)
a = data.frame(x = 1, y = 2)
b = data.frame(x = 1, z = 10)
aa = inner_join(a, b)
For the last line, dplyr prints:
Joining by: "x"
That is nice for interactive work, but I am running the script with Rscript and all these messages are cluttering my output.
To join by different variables on x and y , use a named vector. For example, by = c("a" = "b") will match x$a to y$b . To join by multiple variables, use a vector with length > 1. For example, by = c("a", "b") will match x$a to y$a and x$b to y$b .
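A minimal sketch of both forms of `by`. The data frames and column names (`orders`, `customers`, `left`, `right`) are made up for illustration and are not from the question:

```r
library(dplyr)

# Hypothetical example data; all names here are illustrative only.
orders    <- data.frame(cust_id = c(1, 2, 3), amount = c(10, 20, 30))
customers <- data.frame(id = c(1, 2), name = c("Ann", "Bob"),
                        stringsAsFactors = FALSE)

# Named vector: match orders$cust_id to customers$id.
named <- inner_join(orders, customers, by = c("cust_id" = "id"))

# Unnamed vector with length > 1: match on k1 and k2, which exist in both tables.
left  <- data.frame(k1 = c(1, 1, 2), k2 = c("a", "b", "a"), v = 1:3,
                    stringsAsFactors = FALSE)
right <- data.frame(k1 = c(1, 2), k2 = c("a", "a"), w = c(100, 200),
                    stringsAsFactors = FALSE)
multi <- inner_join(left, right, by = c("k1", "k2"))
```

Because `by` is given explicitly in both calls, dplyr has nothing to guess and should not print a "Joining by" message.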
Joins with dplyr. dplyr names its join functions after the corresponding SQL joins. A left join means: include every row from the left data frame (the x data frame in merge() ) and the matching rows from the right (y) data frame. If the join columns have the same name in both data frames, all you need is left_join(x, y) .
A mutating join allows you to combine variables from two tables. It first matches observations by their keys, then copies across variables from one table to the other. Like mutate() , the join functions add variables to the right, so if you have a lot of variables already, the new variables won't get printed out.
full_join() returns all rows and all columns from both x and y . Where there are no matching values, it returns NA for the missing side. semi_join() returns all rows from x where there are matching values in y , keeping just the columns from x .
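The difference between these joins is easiest to see on a small case. This is only a sketch with made-up data frames `x` and `y` that share a `key` column:

```r
library(dplyr)

# Hypothetical example data: keys 2 and 3 appear in both tables.
x <- data.frame(key = c(1, 2, 3), x_val = c("a", "b", "c"),
                stringsAsFactors = FALSE)
y <- data.frame(key = c(2, 3, 4), y_val = c(TRUE, FALSE, TRUE))

# All rows of x; y_val is NA where x has no match in y (key 1).
left_res <- left_join(x, y, by = "key")

# All rows from both tables; NA fills whichever side is missing.
full_res <- full_join(x, y, by = "key")

# Only the rows of x that have a match in y, and only x's columns.
semi_res <- semi_join(x, y, by = "key")
```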
If you want to be heavy-handed, you can do
aa = suppressMessages(inner_join(a, b))
The better choice, as Jazzurro suggests, is to specify the by argument. dplyr only prints a message to let you know what its guess is for which columns to join by. If you don't make it guess, it doesn't confirm things with you. This is a safer choice as well, from a defensive coding standpoint.
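One way to check this is to collect whatever messages each call raises. The sketch below reuses the data frames from the question; `collect_messages` is a hypothetical helper written for this illustration, not part of dplyr:

```r
library(dplyr)

a <- data.frame(x = 1, y = 2)
b <- data.frame(x = 1, z = 10)

# Hypothetical helper: evaluate expr and return the text of any messages it raises.
collect_messages <- function(expr) {
  msgs <- character(0)
  withCallingHandlers(
    expr,
    message = function(m) {
      msgs <<- c(msgs, conditionMessage(m))
      invokeRestart("muffleMessage")  # keep the message off the console
    }
  )
  msgs
}

guessed  <- collect_messages(inner_join(a, b))            # dplyr has to guess
explicit <- collect_messages(inner_join(a, b, by = "x"))  # nothing to guess

length(guessed)   # at least one message about the join columns
length(explicit)  # no messages
```

In a script you would simply write aa = inner_join(a, b, by = "x"); the helper is only there to show that the message goes away.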
If this is in a knitr document, you can set the chunk option message=FALSE.