I'm building an R package, the main purpose of which is to abstract away the pain of dealing with a proprietary database that requires some fairly complex SQL queries in order get data out.
As such, the connection to the Microsoft SQL Server (obtained by odbcDriverConnect
) is a constant and important part of this package, but I can't work out how best to manage this and I'm hoping for advice as to how this should be implemented in R.
My current thoughts are:
Make the user ensure they have a valid connection before they call any function. Each function then has connection
as a parameter which must be passed. This puts a burden on the user.
In every function, make a call to get.connection()
which will get new connection each time. Old connections are then allowed to timeout naturally, which seems a sloppy approach.
As above, but return the same connection each time. This appears not to be a viable proposition as I can't prevent connections from timing out through R. autoReconnect=TRUE
and other tricks I've used in different languages seem to have no effect.
In Java, I would probably have a DatabaseConnectionPool populated with a number of connections and simply grab connections from, and return them to, that pool as needed. I also don't seem to have the timeout issue in Java when I specify autoReconnect=TRUE
.
Any suggestions much appreciated.
pool
is an R package for pooling connections such as databases. If you're cool to use a github package, take a look at https://github.com/rstudio/pool. Will reuse or recreate the connection as required.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With