I'm trying to analyze data stored in an SQL database (MS SQL Server) in R, on a Mac. Typical queries might return a few GB of data, and the entire database is a few TB. So far, I've been using the R package odbc, and it seems to work pretty well.
However, dbFetch() seems really slow. For example, a somewhat complex query returns all results in ~6 minutes in SQL Server, but if I run it with odbc and then try dbFetch(), it takes close to an hour to get the full 4 GB into a data.frame. I've tried fetching in chunks (roughly as sketched below), which helps modestly: https://stackoverflow.com/a/59220710/8400969. I'm wondering if there is another way to more quickly pipe the data to my Mac, and I like the line of thinking in Quickly reading very large tables as dataframes.
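For reference, here is the chunked pattern I've been using; con, query, and the chunk size are placeholders for my actual connection, SQL, and tuning:

library(DBI)

res <- dbSendQuery(con, query)       # con/query: my existing connection and SQL
chunks <- list()
while (!dbHasCompleted(res)) {
  # pull 500,000 rows at a time instead of materializing everything at once
  chunks[[length(chunks) + 1L]] <- dbFetch(res, n = 500000)
}
dbClearResult(res)
df <- do.call(rbind, chunks)         # stitch the chunks into one data.frame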
What are some strategies for speeding up dbFetch() when the results of queries are a few GB of data? If the issue is generating a data.frame object from larger tables, are there savings available by "fetching" in a different manner? Are there other packages that might help?
Thanks for your ideas and suggestions!
I would suggest using the dbcooper package found on GitHub: https://github.com/chriscardillo/dbcooper. I have found huge improvements in speed when querying large datasets with it.
First, add your connection to your environment:
# Connect via DBI/odbc; fill in your own driver, server, database, and credentials
conn <- DBI::dbConnect(odbc::odbc(),
                       Driver = "",
                       Server = "",
                       Database = "",
                       UID = "",
                       PWD = "")
# Install dbcooper from GitHub and load it
devtools::install_github("chriscardillo/dbcooper")
library(dbcooper)
# Register the connection under an ID, along with the tables you want accessors for
dbcooper::dbc_init(con = conn,
                   con_id = "test",
                   tables = c("schema.table"))
This adds the function test_schema_table() to your environment, which is used to call the data. To collect the data into your environment, use test_schema_table() %>% collect().
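For example, since the generated accessor returns a lazy dplyr table, you can (as I understand it) push filters to the server before collecting; my_column here is just a hypothetical column name:

library(dplyr)

# filter() is translated to SQL and runs server-side, so collect()
# only pulls the matching rows over the network
df <- test_schema_table() %>%
  filter(my_column > 100) %>%
  collect()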
Here is a microbenchmark I ran to compare fetching with plain DBI versus dbcooper:
# qry is my SQL query string; ava_qry() is the accessor dbc_init() generated
# for one of my tables
mbm <- microbenchmark::microbenchmark(
  DBI = DBI::dbFetch(DBI::dbSendQuery(conn, qry)),
  dbcooper = ava_qry() %>% collect(),
  times = 5
)
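Printing mbm summarizes the timings, and if you have ggplot2 installed, autoplot() gives a quick visual comparison:

print(mbm)              # summary of the five runs per expression
ggplot2::autoplot(mbm)  # plot the timing distributions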