Pgbouncer: how to run within a kubernetes cluster properly

Tags: postgresql, kubernetes, pgbouncer

The background: I currently run some Kubernetes pods with a pgbouncer sidecar container. I've been running into annoying sidecar behavior (which will be addressed in k8s 1.18). It has workarounds, but it has revived an earlier question of mine about running pgbouncer inside k8s.

Many folks recommend the sidecar approach for pgbouncer, but I wonder why running one pgbouncer per, say, machine in the k8s cluster wouldn't be better. I admit I don't understand either pgbouncer or k8s networking deeply enough to judge the implications of either approach.

EDIT:

Adding context, as it seems like my question wasn't clear enough.

I'm trying to decide between two approaches of running pgbouncer in a kubernetes cluster. The PostgreSQL server is not running in this cluster. The two approaches are:

1. Running pgbouncer as a sidecar container in all of my pods. I have a number of pods: some replicas on a webserver deployment, an async job deployment, and a couple of cron jobs.
2. Running pgbouncer as a separate deployment. I'd plan on running 1 pgbouncer instance per node on the k8s cluster.

I worry that (1) will not scale well. If my PostgreSQL master allows at most 100 connections and each pool has a max of 20 connections, five fully used sidecar pools are enough to saturate it. Additionally, I risk saturating connections on master during pushes, when new pods with their own pgbouncer sidecars start up while pods running the old image are still being removed.
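To make the arithmetic behind this worry concrete, here is a rough Python sketch. It assumes each pgbouncer instance serves a single database/user pair, so one instance opens at most `pool_size` server connections. The function names, the node count, and the surge count are hypothetical and chosen only for illustration; the 100 and 20 come from the numbers above.

```python
def max_poolers_before_saturation(max_connections: int, pool_size: int) -> int:
    """How many fully used pools fit under the server's connection limit."""
    return max_connections // pool_size


def worst_case_server_connections(poolers: int, pool_size: int, surge: int = 0) -> int:
    """Worst case: every pooler (plus any extra ones alive during a push)
    has filled its pool. Assumes one database/user pair per pooler."""
    return (poolers + surge) * pool_size


MAX_CONNECTIONS = 100  # PostgreSQL limit from the question
POOL_SIZE = 20         # per-pool limit from the question

# Approach (1): one pooler per pod. Five pods already reach the limit.
print(max_poolers_before_saturation(MAX_CONNECTIONS, POOL_SIZE))

# Approach (1) during a push: 5 old pods plus 2 new pods (assumed surge).
print(worst_case_server_connections(poolers=5, pool_size=POOL_SIZE, surge=2))

# Approach (2): one pooler per node, here an assumed 3-node cluster.
print(worst_case_server_connections(poolers=3, pool_size=POOL_SIZE))
```

With approach (2) the worst case grows with the number of nodes rather than the number of pods, which is the scaling difference I am asking about.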

I, however, almost never see (2) recommended. It seems like everyone recommends (1), but the drawbacks seem quite obvious to me. Would the networking penalty I'd incur by connecting to a pgbouncer outside of my pod be large enough to notice? Is pgbouncer perhaps smart enough to coordinate with many other pgbouncer instances that could together saturate connections?


asked Feb 08 '20 23:02

SirensOfTitan

1 Answer

We run pgbouncer in production on Kubernetes. I expect the best way to do it depends on the use case. We do not take the sidecar approach; instead we run pgbouncer as a separate "deployment", and the application accesses it via a "service". This is because, for our use case, we have 1 postgres instance (i.e. one physical DB machine) and many copies of the same application accessing that instance (each using a different database within it).

Pgbouncer is used to manage the limited resource of active connections. We pool connections independently for each application because our application tends to hold many concurrent connections but run relatively few transactions. Each copy of the application runs its own pgbouncer, and each copy has multiple components that need to access the DB, so each pgbouncer pools the connections of one instance of the application. We currently run with 1 pod (no replicas), which is acceptable for our use case as long as pgbouncer restarts quickly.

It is done like this: https://github.com/astronomer/airflow-chart/tree/master/templates/pgbouncer

The above does not cover setting up the credentials for accessing the database. The linked template expects a secret to already exist. I expect you will need to adapt the template to your use case, but it should help you get the idea.
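From the application's side, using a pooler behind a service mostly means pointing the connection string at the service's in-cluster DNS name instead of at localhost or the database host. Here is a minimal Python sketch of that. The service name, namespace, database, and user are hypothetical placeholders, and it assumes pgbouncer listens on its default port 6432 and that the cluster uses the default `cluster.local` DNS domain.

```python
from urllib.parse import quote


def pgbouncer_dsn(service: str, namespace: str, database: str, user: str,
                  port: int = 6432, cluster_domain: str = "cluster.local") -> str:
    """Build a connection URL that targets a pgbouncer Service by its
    in-cluster DNS name (<service>.<namespace>.svc.<cluster domain>)."""
    host = f"{service}.{namespace}.svc.{cluster_domain}"
    # Credentials are deliberately left out; they would come from a secret.
    return f"postgresql://{quote(user, safe='')}@{host}:{port}/{quote(database, safe='')}"


# Hypothetical names, for illustration only.
print(pgbouncer_dsn("pgbouncer", "default", "app_db", "app_user"))
```

The password is omitted on purpose, since the credentials are handled separately through a secret.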

We have had some production concerns. Primarily, we still need to investigate how to replace or move pgbouncer without interrupting existing connections. We have found that the application's connection to pgbouncer is stateful (of course, because it is pooling the transactions), so if the pgbouncer container (pod) is swapped out behind the service for a new one, existing connections are dropped from the application's perspective. This should be fine, even when running pgbouncer replicas, if your application retries on the rare dropped connection and you use Kubernetes sticky sessions (session affinity) on the "service". Our organization still needs to investigate more to make it work perfectly.
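As a rough illustration of the "retry on dropped connections" idea, here is a generic Python sketch. It is not what we run in production: `run_with_retry` and its parameters are hypothetical, and the built-in `ConnectionError` stands in for whatever exception your database driver raises when pgbouncer goes away.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def run_with_retry(operation: Callable[[], T],
                   attempts: int = 3,
                   base_delay: float = 0.1,
                   retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
                   sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation(), retrying with exponential backoff when the
    connection is dropped. The last failure is re-raised unchanged."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

The operation should open a fresh connection and run a whole transaction on each attempt, so that a retry never resumes half-finished work on a connection that no longer exists.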

answered Oct 01 '22 16:10

Steven Miller

