Given 2 remote tables (simulated with tbl_lazy
for this example)
library("dplyr")
library("dbplyr")
t1 <- tbl_lazy(df = iris, src = dbplyr::simulate_mysql())
t2 <- tbl_lazy(df = mtcars, src = dbplyr::simulate_mysql())
How can I perform an actual* cross join between t1 and t2 using R and dbplyr?
* i.e. using CROSS JOIN
in the translated SQL query
Note that I know how to perform all the other types of joins, this is precisely about CROSS joins.
I am aware of the following trick:
joined <- t1 %>%
mutate(tmp = 1) %>%
full_join(mutate(t2, tmp = 1), by = "tmp") %>%
select(-tmp)
However
CROSS JOIN
. Using show_query(joined)
shows that the generated SQL query uses LEFT JOIN
.Sadly, there is no cross_join
operator in dplyr
and sql_join(t1, t2, type = "cross")
does not work either (not implemented for tbl
s, works only on DB connections).
How can I generate an SQL CROSS JOIN
with dbplyr?
According to the dbplyr
NEWS file, since version 1.10, if you use a full_join(..., by = character())
, it will "promote" the join to a cross join. This doesn't seem to be documented anywhere else yet, but searching the dbplyr Github repo for "cross" turned it up in both code and the NEWS file.
This syntax does not seem to work for local data frames, only via SQL.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With