Assume I have the below setup on Heroku + Rails, with one web dyno and two worker dynos.
Below is what I believe to be true, and I'm hoping that someone can confirm these statements or point out an assumption that is incorrect.
I'm confident in most of this, but I'm a bit confused by the usage of client and server, "connection pool" referring to both DB and Redis connections, and "worker" referring to both puma and heroku dyno workers.
I wanted to be crystal clear, and I hope this can also serve as a consolidated guide for any other beginners having trouble with this
Thanks!
A web dyno (where the Rails application runs)
A Worker dyno
An ActiveRecord pool size of 25 means that each dyno has 25 connections to work with. (This is what I'm most unsure of. Is it each dyno or each Puma/Sidekiq worker?)
For the web dynos, it can only run 10 things (threads) at once (2 puma x 5 threads), so it will only consume a maximum of 10 threads. 25 is above and beyond what it needs.
For worker dynos, the Sidekiq concurrency of 15 means 15 Sidekiq processes can run at a time. Again, 25 connections is beyond what it needs, but it's a nice buffer to have in case there are stale or dead connections that won't clear.
In total, my Postgres DB can expect 10 connections from the web dyno and 15 connects from each worker dyno for a total of 40 connections maximum.
The web dyno (Sidekiq client) will use the connection pool size
specified in the Sidekiq.configure_client
block. Generally ~3 is sufficient because the client isn't constantly adding jobs to the queue. (Is it 3 per dyno, or 3 per Puma worker?)
Each worker dyno (Sidekiq server) will use the connection pool size
specified in the Sidekiq.configure_server
block. By default it's sidekiq concurrency + 2, so here 17 redis connections will be taken up by each dyno
A connection pool synchronizes thread access to a limited number of database connections. The basic idea is that each thread checks out a database connection from the pool, uses that connection, and checks the connection back in.
Note: Sidekiq needs 2 connections by default for a bunch of stuff like a heartbeat, maintain sentinel, etc. Read here. That's why if you have max threads or concurrency of for ex. 5 then you need to have at least 7 connections in your pool.
Create a Procfile Push the changes to the repo and Heroku. In the Free Dynos section under the Resource tab, will appear the worker we add in the Procfile , enable it clicking the edit button to right. With this we have Sidekiq configured to use it on Heroku.
pool is the config of size of connection pool, which is 5 by default.
I don't know Heroku + Rails but believe I can answer some of the more generic questions.
From the client's perspective, the setup/teardown of any connection is very expensive. The concept of connection pooling is to have a set of connections which are kept alive and can be used for some period of time. The JDK HttpUrlConnection does the same (assuming HTTP 1.1) so that - assuming you're going to the same server - the HTTP connection stays open, waiting for the next expected request. Same thing applies here - instead of closing a JDBC connection each time, the connection is maintained - assuming same server and authentication credentials - so the next request skips the unnecessary work and can immediately move forward in sending work to the database server.
There are many ways to maintain a client-side pool of connections, it may be part of the JDBC driver itself, you might need to implement pooling using something like Apache Commons Pooling, but whatever you do it's going to increase your behavior and reduce errors that might be caused by network hiccups that could prevent your client from connecting to the server.
Server-side, most database providers are configured with a pool of n possible connections that the database server may accept. Usually each additional connection has a footprint - usually quite small - so based on the memory available you can figure out the maximum number of available connections.
In most cases, you're going to want to have larger-than-expected connections available. For example, in postgres, the configured connection pool size is for all connections to any database on that server. If you have development, test, and production all pointed at the same database server (obviously different databases), then connections used by test might prevent a production request from being fulfilled. Best not to be stingy.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With