Multi-AZ RDS test failover and connection monitoring

Tags: amazon-web-services, monitoring, amazon-rds, failover

My question has two parts:

(1) What is the best way to initiate an RDS failover for testing purposes?
(2) How can I monitor the connection during failover in order to observe the time that it takes for AWS to reconnect the user to the standby instance?

With respect to part (1): If I understand correctly, all instance modifications are made on the standby and then AWS fails over by flipping the CNAME over to the standby as the primary is updated, so if I were to make any kind of instance modification and select "apply immediately," it should cause a failover, correct?

With respect to part (2): I am looking specifically for a way of monitoring the failover of an Oracle RDS instance, whether through a Lambda function, a bash script, or some other means. As far as I can tell, it is not possible to use ping with RDS, even when I allow all ICMP traffic via the security group. I can connect without trouble using telnet or a SQL client. What I would like, though, is a way to periodically probe the database during a failover to see when the IP associated with the connection string switches over and how long that takes. Any suggestions?

asked Mar 08 '17 16:03

amparito

2 Answers

Correct, RDS will make your modifications on the standby instance and then fail over to it. Per the AWS documentation:

The availability benefits of Multi-AZ deployments also extend to planned maintenance and backups. In the case of system upgrades like OS patching or DB Instance scaling, these operations are applied first on the standby, prior to the automatic failover. As a result, your availability impact is, again, only the time required for automatic failover to complete.

To simulate a failover, choose the reboot with failover option when rebooting the instance, rather than a standard reboot. From the linked documentation:

Reboot with failover is beneficial when you want to simulate a failure of a DB instance for testing, or restore operations to the original AZ after a failover occurs.
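For anyone who prefers the command line to the console, the same reboot with failover can be requested through the AWS CLI. The following is a minimal sketch, assuming the AWS CLI is installed and configured with permission to reboot the instance; the function name trigger_failover and the identifier passed to it are placeholders for illustration, not values from this thread.

```shell
#!/usr/bin/env bash
# Sketch: request a reboot with failover for a Multi-AZ RDS instance.
# trigger_failover is a hypothetical helper name; the instance identifier
# you pass in is your own.

trigger_failover() {
  local instance_id="$1"
  if [ -z "$instance_id" ]; then
    echo "usage: trigger_failover DB_INSTANCE_IDENTIFIER" >&2
    return 2
  fi
  # --force-failover asks RDS to fail over to the standby as part of the
  # reboot; it only applies to Multi-AZ deployments.
  aws rds reboot-db-instance \
    --db-instance-identifier "$instance_id" \
    --force-failover
}
```

Calling trigger_failover my-test-db (with your own identifier) should start the failover; running it against a test instance first is the safer choice.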

Write a script that, on a regular interval, connects with a SQL client and performs a quick select on a table of your choice. You can use this to measure true downtime during the failover; we have a very similar tool that we use to estimate the downtime of modifications on a test RDS instance before we apply them to our production RDS instance. Our tool simply writes a timestamp to the console, along with whether the query succeeded or failed, every few seconds. The tool will write success before the reboot, failure during, and success again after the cutover completes.
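The tool described above is not shown in this answer, so the following is only a rough sketch of the same idea in bash. It assumes sqlplus and GNU coreutils timeout are available, and that a DB_CONNECT variable holds a connect string of the form user/password@//host:port/service; the names probe_loop, oracle_probe, and DB_CONNECT are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: run a probe command at a fixed interval and log timestamped results.

# probe_loop INTERVAL_SECONDS COUNT COMMAND [ARGS...]
# Runs COMMAND COUNT times, INTERVAL_SECONDS apart, printing one line per
# attempt: "<UTC timestamp> OK" or "<UTC timestamp> FAIL".
probe_loop() {
  local interval="$1" count="$2"
  shift 2
  local i
  for ((i = 0; i < count; i++)); do
    if "$@" >/dev/null 2>&1; then
      echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) OK"
    else
      echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) FAIL"
    fi
    # No need to sleep after the final attempt.
    if ((i < count - 1)); then sleep "$interval"; fi
  done
}

# Hypothetical Oracle probe. Assumes DB_CONNECT is set by the caller.
# -S keeps sqlplus quiet, -L makes it attempt the logon only once, and
# timeout stops a hung connection from stalling the loop.
oracle_probe() {
  printf 'WHENEVER SQLERROR EXIT FAILURE\nSELECT 1 FROM dual;\nEXIT\n' |
    timeout 10 sqlplus -S -L "$DB_CONNECT"
}
```

For example, probe_loop 5 120 oracle_probe would probe every 5 seconds for about 10 minutes. Start it before triggering the failover and read the downtime off the run of FAIL lines.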

Additional Resources:

Modifying an Amazon RDS DB Instance and Using the Apply Immediately Parameter
Modifying a DB Instance Running the Oracle Database Engine

answered Oct 18 '22 02:10

Anthony Neace

Update on this:

I ended up using a simple bash script:

while true; do date; nc -vz DBNAME.REGION.rds.amazonaws.com PORT; sleep 1; done

Note: the above is for netcat-openbsd. If using netcat-traditional, you'll need to modify this.

This polls the database once per second to see whether it is still possible to connect. Typically, when I ran this and then initiated a reboot with failover, the connection would simply hang during the failover and then display a timeout error once the failover was complete and connectivity resumed, presumably because the failover usually takes longer than the reboot. If the reboot happens to take longer than the failover, though, there may be a period during which the connection is refused while the reboot completes. In any case, using this method, I was able to get a consistent failover time of 2:08.
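Instead of reading the outage length off the timestamps by eye, the loop can compute it. The sketch below is a variation on the one-liner, not the script I actually ran: it assumes netcat-openbsd, adds a 2-second connect timeout with -w so a hanging attempt counts as a failure, and uses the made-up helper names tcp_probe and measure_outage.

```shell
#!/usr/bin/env bash
# Sketch: time the gap between the first failed probe and the next success.

# tcp_probe HOST PORT
# Succeeds if a TCP connection opens within 2 seconds (netcat-openbsd flags).
tcp_probe() {
  nc -z -w 2 "$1" "$2" >/dev/null 2>&1
}

# measure_outage INTERVAL_SECONDS COMMAND [ARGS...]
# Polls COMMAND until it fails, then until it succeeds again, and prints the
# seconds elapsed between the first failure and the following success.
measure_outage() {
  local interval="$1"
  shift
  local down_at up_at
  # Phase 1: wait for the first failed probe.
  while "$@"; do sleep "$interval"; done
  down_at=$(date +%s)
  # Phase 2: wait for the first successful probe after that.
  until "$@"; do sleep "$interval"; done
  up_at=$(date +%s)
  echo "outage_seconds=$((up_at - down_at))"
}
```

Something like measure_outage 1 tcp_probe DBNAME.REGION.rds.amazonaws.com PORT, started just before the reboot with failover, should print a single outage figure. The result is only accurate to within the polling interval plus the connect timeout.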

It seems, however, that contrary to what I originally thought, most instance modifications do not involve a failover at all. I have tested resizing the instance as well as changing the option groups and parameter groups, and did not experience any downtime.

Changing the database engine does result in a failover.

answered Oct 18 '22 03:10

amparito
