Deploying a high availability PostgreSQL 9.0 on Amazon EC2 with PGPool-ii

We have an existing web application that uses PostgreSQL 9.0 and PGPool-ii. I am thinking of migrating our infrastructure to Amazon EC2 and was inspired by the following link, which describes a similar architecture: http://aws.typepad.com/aws/2008/12/running-everything-on-aws-soocialcom.html

Since Amazon RDS doesn't support PGSQL, we are going to stick with PGPool-ii to load-balance the queries across the DB servers and keep them synchronized with each other.

So we plan to deploy 3 frontend web servers, each of which will contain the following:
- Web Server + PHP code
- PGPool-ii

Then, we would have 2 database servers on separate Amazon instances with only PGSQL. These 2 PG servers would be used by the PGPools located on the 3 frontend servers.

My question is whether this solution is reliable enough, since multiple PGPool instances will access multiple PGSQL servers. Most examples of PGPool demonstrate a single PGPool that uses N underlying PGSQL servers. Is it good practice to deploy a PGPool instance on each web server?

If not, is there any other/better architecture to avoid having a SPOF on Amazon?

Thank you very much for your replies.

Mike asked Aug 04 '11 11:08



1 Answer

A couple of thoughts. First, we avoid a SPOF for things like PGPool through the use of Heartbeat, Pacemaker and an ElasticIP. Run two (or more) instances dedicated to PGPool. Assign an ElasticIP to one of them. Set up Heartbeat and Pacemaker to monitor PGPool. On failover, have Pacemaker run a script that assigns the ElasticIP to the new master (DC in Pacemaker terms). If you're only running two nodes, make sure that you disable quorum functionality in Pacemaker, because you can't have a quorum if one node goes down out of a total of two nodes.
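The script that Pacemaker runs on failover could look roughly like the sketch below. It is not the script we actually use. It assumes a boto3-style EC2 client whose `associate_address` call accepts `AllocationId`, `InstanceId` and `AllowReassociation`. The names `move_elastic_ip` and `DryRunClient` and the IDs in the comments are made up for illustration.

```python
def move_elastic_ip(ec2_client, allocation_id, new_instance_id):
    """Point the ElasticIP at the node that just took over PGPool.

    ec2_client is assumed to behave like boto3.client("ec2").
    """
    response = ec2_client.associate_address(
        AllocationId=allocation_id,
        InstanceId=new_instance_id,
        # Take the address over even if it is still bound to the failed node.
        AllowReassociation=True,
    )
    return response.get("AssociationId")


class DryRunClient:
    """Hypothetical stand-in that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def associate_address(self, **kwargs):
        self.calls.append(kwargs)
        return {"AssociationId": "eipassoc-dryrun"}


# Real use (assumption: boto3 is installed and credentials are configured):
#   import boto3
#   move_elastic_ip(boto3.client("ec2"), "eipalloc-EXAMPLE", "i-EXAMPLE")
```

Passing the client in as a parameter lets you exercise the Pacemaker hook with `DryRunClient` before pointing it at a real account.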

To take advantage of the ElasticIP, do a reverse DNS lookup on your ElasticIP from outside of Amazon. This will give you a DNS name that maps to the ElasticIP which should end in amazonaws.com. DNS lookups from an EC2 instance for a domain name ending in amazonaws.com will actually resolve to the internal IP address for the instance that has been assigned the ElasticIP. You can either point your application servers directly at the DNS for the ElasticIP or, assuming you're running your own DNS, you can create a CNAME that refers to the ElasticIP DNS.
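As a rough sketch, the reverse lookup can be done with the standard `socket` module. Run it from a machine outside of Amazon, as described above. The function name `elastic_ip_dns_name` and the sample address in the comment are illustrative only.

```python
import socket


def elastic_ip_dns_name(elastic_ip, reverse_lookup=socket.gethostbyaddr):
    """Return the amazonaws.com name that maps to the ElasticIP.

    reverse_lookup defaults to a real reverse DNS query; it is a parameter
    so the function can be tried without network access.
    """
    hostname, _aliases, _addresses = reverse_lookup(elastic_ip)
    if not hostname.endswith(".amazonaws.com"):
        raise ValueError(
            f"{elastic_ip} resolved to {hostname!r}, not an amazonaws.com name"
        )
    return hostname


# Example (hypothetical address, requires network access):
#   name = elastic_ip_dns_name("203.0.113.25")
# Point the application servers, or a CNAME in your own DNS, at that name.
```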

That said, there's one big catch to using ElasticIPs for failover. Re-assigning the ElasticIP takes up to 120 seconds to take effect. Most of the time is spent waiting for the change to propagate through Amazon's DNS servers.
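One way to live with that window is to have the application servers retry their connection for at least as long as the re-assignment can take. The sketch below is only an illustration of that idea, not something from our setup. `connect` is a hypothetical callable that opens the connection to PGPool and raises `OSError` on failure. With a real database driver you would catch that driver's own connection error instead.

```python
import time


def connect_with_retry(connect, timeout_s=120.0, interval_s=5.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Keep calling connect() until it succeeds or timeout_s has elapsed."""
    deadline = clock() + timeout_s
    while True:
        try:
            return connect()
        except OSError as exc:
            # Give up once the failover window has passed.
            if clock() >= deadline:
                raise TimeoutError(
                    f"no connection after {timeout_s} seconds"
                ) from exc
            sleep(interval_s)
```

The 120 second default mirrors the worst case mentioned above. The 5 second interval is an arbitrary choice.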

Also, while I have not tried running PGPool-ii on each Application Server, I'm not sure this would work. If the master database fails, I think each of the PGPool instances would be competing to handle the failover. Maybe I'm just not familiar enough with PGPool-ii to understand the best way to handle that.

As for the person who mentioned plproxy, I think they have it confused with PGBouncer, which is recommended for use with plproxy. plproxy is a partitioning system, not a load balancer. That said, PGBouncer is not a load balancer either - it's a connection pooling system. In fact, the FAQ for PGBouncer explicitly recommends using a TCP load balancer like HAProxy.

In addition, the statements about Amazon having vertical scalability problems that Rackspace solves are incorrect. With Amazon EC2 instances you can always stop an instance and upgrade it to a larger instance type. Neither Amazon nor Rackspace supports changing instance types on the fly.

organicveggie answered Oct 01 '22 17:10