Most code examples showing how to use dplyr with a database involve creating a database connection object:
connStr <- "driver=driver;server=hostname;database=mydatabase;..."
db <- DBI::dbConnect(odbc::odbc(), .connection_string=connStr)
tbl <- tbl(db, "mytable")
tbl %>% verb1 %>% verb2 %>% ...
However, suppose I omit creating a db
object:
tbl <- tbl(DBI::dbConnect(odbc::odbc(), .connection_string=connStr), "mytable")
tbl %>% verb1 %>% verb2 %>% ...
Are there any consequences to this? Will I use up database resources/leak memory/etc?
The DBMS I have in mind is SQL Server, and the driver package is odbc, in case it matters.
The new DBI specs assume that the caller frees all connection they allocate with dbConnect()
with a corresponding call to dbDisconnect()
. Failure to do so will close the connection only during garbage collection (or the end of the R session), thus delaying liberation of resources, or even leak the connection.
The exact behavior depends on the DBI backend involved (in this case the odbc package). According to Jim Hester, the maintainer of odbc,
[it] automatically calls
dbDisconnect()
when the connection object is garbage collected, so this won't leak connections. If you are opening a large number of connections it is always best to be explicit, if you are just doing this interactively it is probably ok to rely on the garbage collector in this case.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With