Is there a way to use dplyr::bind_rows without collecting data frames from the database?

Question

Is there a way to use bind_rows() on a set of data frames without first collecting them from the database?

Say I've defined a couple dplyr query tables:

mydatabase <- src_mysql('database')
table1 <- tbl(mydatabase, "table1")
table2 <- tbl(mydatabase, "table3")

foo <- table1 %>% filter(id > 10) %>% select(id)
bar <- table2 %>% select(id)

I'd like to be able to combine foo and bar. In essence, I'd like to perform a union on the two subqueries without having to drop to SQL. However, when I try that, I get an error because I'm passing two tbl_sql objects rather than real data frames:

unioned_data_frame <- bind_rows(foo, bar)

Error: incompatible sizes (1 != 8)

Any suggestions? In this toy example, writing the whole query in SQL wouldn't be a problem, but of course, in real life, foo and bar are often significantly more complicated.

crazybilly · Accepted Answer

dplyr::union() performs a SQL UNION on the two queries, but note that it removes duplicate rows (as the SQL version does). dplyr::union_all() keeps duplicate rows, like bind_rows().

Unfortunately, there isn't a way to get the other benefits of bind_rows(), particularly the very useful .id argument.
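
To make the difference concrete, here is a minimal sketch rather than a definitive recipe. It assumes the DBI, RSQLite and dbplyr packages are installed, and it uses an in-memory SQLite database as a stand-in for the MySQL source in the question. The sample rows and the `source` column name are made up for illustration. Adding a constant column with mutate() before the union is one possible way to approximate what .id does in bind_rows():

```r
library(dplyr)
library(dbplyr)

# Assumption: an in-memory SQLite database stands in for the MySQL source
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical sample data, copied to the database as temporary tables
table1 <- copy_to(con, data.frame(id = c(5, 11, 12)), "table1")
table2 <- copy_to(con, data.frame(id = c(11, 20)), "table2")

# The same lazy queries as in the question
foo <- table1 %>% filter(id > 10) %>% select(id)
bar <- table2 %>% select(id)

# union() removes duplicate rows, like SQL UNION
deduped <- collect(union(foo, bar))

# union_all() keeps duplicates; the constant `source` column
# (a hypothetical name) labels each row's origin, similar to .id
combined <- union_all(
  foo %>% mutate(source = "foo"),
  bar %>% mutate(source = "bar")
)

# The query is still lazy here: this prints the generated SQL
show_query(combined)

# Only now are the rows pulled into R
result <- collect(combined)

DBI::dbDisconnect(con)
```

Everything up to collect() runs in the database, so nothing has to be pulled into R before the two queries are combined.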

Tags: mysql, r, dplyr
