I have two large tables, old_customers and new_customers and I want to row bind them. This is done but either one is too large to load in-memory. However, I'm not sure if bind_rows
will allow binding of two tables without them being both data frames. I get this error when I try
old <- tbl(conn, 'old_customers')
new <- tbl(conn, 'new_customers')
old %>% bind_rows(new)
Error in bind_rows_(x, .id) :
Argument 1 must be a data frame or a named atomic vector, not a tbl_dbi/tbl_sql/tbl_lazy/tbl
The other option would be something like
old <- as.data.frame(tbl(conn, 'old_customers'))
new <- as.data.frame(tbl(conn, 'new_customers'))
old %>% bind_rows(new)
But again, I am trying to avoid loading this dataset into memory. Is there a way around this?
You can use the union_all
function, which operates very similarly to an rbind
or bind_rows
, but should have the added advantage of not necessitating the loading of data in memory.
old <- tbl(conn, 'old_customers')
new <- tbl(conn, 'new_customers')
old %>% union_all(new)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With