 

How to connect an on-premises application to AWS Aurora Serverless

We have a bunch of on-premises applications each running their own local MySQL servers. Our workload is light, with occasional bursts of activity (a B2B business model with some specific times of the month in which it is more profitable to use our application, and therefore we see usage spikes during those days). We decided that it would be a good idea to simplify the infrastructure by moving all the databases into one server/cluster, and after some discussion decided that buying a managed solution would be better than trying to set up and maintain our own MySQL cluster (none of us are DBAs).

We did a thorough amount of research, and eventually settled on Amazon Aurora Serverless as a solid candidate for its auto-scaling capabilities, and therefore (potentially) lower cost compared to the alternatives we examined (AWS MySQL RDS and DigitalOcean managed MySQL), due to our usually-light workload with occasional bursts of activity.

However, from what I can gather, it is impossible to simply connect to AWS Aurora Serverless from our on-premises applications (see "Not able connect Amazon Aurora Serverless from SQL client" for example), so my question is:

  1. What is the best-practice, modern way to solve this problem - should we use a site-to-site VPN to connect our on-premises hosts to the cloud? Would this end up costing us significantly more?
  2. Is Aurora Serverless really the best solution at all, or should we fall back to Amazon RDS, or DigitalOcean's managed MySQL cluster, both of which allow assigning public IPs but neither of which will auto-scale (meaning we'd need to buy a tier based on our peak usage, and potentially waste a lot of money as it will sit almost idle for a large part of the month)?

What we want to achieve is a simple, fire-and-forget MySQL cluster setup that's managed by someone else, ideally auto-scales, and doesn't cost the earth or end up being more difficult to manage than the current, on-premises solution.

We are not cloud-averse, but neither do we want to suddenly start moving everything into the cloud all at once just for the sake of a simpler database infrastructure.

To throw an extra spanner into the works, we don't manage our own firewalls - so setting up a site-to-site VPN could be tricky and involve coordinating with a third party (our network provider). Ideally I'd like to avoid this hassle too, if at all possible.

Chris Browne asked Aug 14 '20



2 Answers

I understand that you have some questions around hybrid cloud architectures with regard to Amazon Aurora Serverless. This is a really tough topic and might easily be seen as opinionated (luckily the community left this open though). So I will try to reference as much public material as possible and explain how I would think about designing this kind of setup.

As a disclaimer, I am not an AWS official. However, I have been building and operating cloud applications in the startup industry for the last three years... and coincidentally I have a couple of minutes, so here are my thoughts:

1. Problem Statement

Aurora Serverless is accessible through VPC Interface Endpoints [1]:

Each Aurora Serverless DB cluster requires two AWS PrivateLink endpoints. If you reach the limit for AWS PrivateLink endpoints within your VPC, you can't create any more Aurora Serverless clusters in that VPC.

According to the docs [1], as you already pointed out correctly, these endpoints are a private construct:

You can't give an Aurora Serverless DB cluster a public IP address. You can access an Aurora Serverless DB cluster only from within a virtual private cloud (VPC) based on the Amazon VPC service.

2. Question Scope

Your questions involve the best-practices (Q1), the cost aspects (also Q1) and the functional differences to other database options in the cloud (Q2), e.g. public access via the internet and auto scaling.

These are all valid questions when migrating database workloads into the public cloud. But at the same time, they are only a subset of questions that should be considered.
As far as I understand, we have three challenges here that should be clearly highlighted: you are (C1) initiating a migration to the cloud, (C2) about to turn your existing workload into a hybrid workload, and (C3) performing a database migration. All three are generally big topics on their own and none of them should be decided upon prematurely. However, if your workload is, as you described, "light", the risk of doing them all together might be acceptable. That is not something I am able to discuss in the following.

So let's focus on the very basic question which comes to my mind when I look at the challenges (C1) - (C3) described above:

3. Is a hybrid workload acceptable? (C2)

I think the main question you should ask yourself is whether the on-premises workload can be transformed into a hybrid workload. Consequently, you should think about the impact of placing your database far away from its clients with regard to latency and reliability. Furthermore, you should evaluate whether the new database engine fits your performance expectations (e.g. scaling up fast enough for peak traffic) [3] and whether database compatibility and limitations are acceptable [4].

Usually a connection into the cloud (possibly over an external network carrier) is less reliable than a bunch of cables on-premises. Your workload may even be so small that the DB and its clients run on the same hypervisor/machine. In that case, moving things far apart (connected over a third-party network) should definitely be considered carefully.
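
If you want to quantify the latency aspect before committing, a throwaway sketch like the following can help. It assumes PyMySQL as the driver; the endpoint and credentials are placeholders, so treat it as a starting point rather than a finished benchmark.

```python
# Throwaway sketch: measure round-trip latency of a trivial query from an
# on-premises host to a candidate cloud database.
# Endpoint and credentials are placeholders.
import time
import statistics
import pymysql

conn = pymysql.connect(
    host="test-instance.abc123.eu-central-1.rds.amazonaws.com",  # placeholder
    user="app", password="secret", database="mydb",
)

samples = []
with conn.cursor() as cur:
    for _ in range(50):
        start = time.perf_counter()
        cur.execute("SELECT 1")
        cur.fetchall()
        samples.append((time.perf_counter() - start) * 1000)
conn.close()

print(f"median {statistics.median(samples):.1f} ms, "
      f"p95 {sorted(samples)[int(len(samples) * 0.95)]:.1f} ms")
```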

For a workload to be reliable and/or highly available, not only does Aurora have to meet these standards (which it does), but your network connection does too.

When you ask yourself the right questions, you automatically start to characterise your workload. AWS has published a bunch of public guidelines to aid you in this process.
There is the Well-Architected Framework [10] and the Well-Architected Tool [11] - the latter being the "automated" way to apply the framework. As an example, the Reliability Pillar [9] contains some thoughts and expertise from AWS experts to really question your hybrid approach.

Moreover, AWS publishes so-called Lenses [13] to discuss specific workload types from the well-architected perspective. As you asked for best practices (Q1), I want to point out that there is currently no detailed guideline/lens for the type of workload you described.

However, there is an Aurora guide called "Performing a Proof of Concept with Amazon Aurora" in the docs [12]. (more information below in section "Aurora POC Guide")

In the past I have worked on applications that used the database layer so heavily that they could not have undergone a change like this without major refactoring...
Which brings me to the second point: Migration Strategy.

4. What is the acceptable migration strategy? (C1)

Since this is a database migration, there are two major questions you should ask yourself: (a) to what degree do you want to migrate (called the 6R's of migration - a general concept which is independent from databases) and (b) how to lift the database parts into the cloud (especially data). I do not want to go into detail here since it is highly dependent on your workload characteristics.

AWS has published a detailed guideline which aids you with these decisions. [15]
It mentions some useful tools such as DMS and SCT, which help you to convert your schema properly (if necessary) and to move your data from the source database cluster into the target cluster (optionally in an "online"/"live" migration manner without downtime).

I want to highlight once again that there is a major decision you have to make: replatforming vs. rearchitecting the application (i.e. the database clients). I guess you can make Aurora Serverless work with only a small number of changes, but in order to take full advantage of Aurora's capabilities, rearchitecting is probably necessary (which may well end in moving the whole workload into the cloud anyway).

If you decide to do a partial refactoring of your application, you could use the so-called Data API as well. The Data API for Aurora Serverless [7][8] makes it possible to send queries directly over the public internet. It might be a valid fit for you if (i) you can afford to refactor some parts of your application code and (ii) your application's characteristics fit the Data API. The Data API takes a completely new approach to database connection management and thus suits some serverless use cases very well. However, it might not fit some traditional database workloads with long-lived / heavily used connections. You should also note the database engine compatibility for the Data API ("Availability of the Data API" [12]).
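
For illustration only, here is a minimal sketch of what a Data API call looks like with boto3; the cluster ARN, Secrets Manager ARN, database name and SQL are placeholders, not values from your setup.

```python
# Minimal sketch of querying Aurora Serverless via the Data API (boto3).
# All ARNs, the database name and the SQL are placeholders.
import boto3

client = boto3.client("rds-data", region_name="eu-central-1")

response = client.execute_statement(
    resourceArn="arn:aws:rds:eu-central-1:123456789012:cluster:my-serverless-cluster",
    secretArn="arn:aws:secretsmanager:eu-central-1:123456789012:secret:my-db-secret",
    database="mydb",
    sql="SELECT id, name FROM customers WHERE created_at > :since",
    parameters=[{"name": "since", "value": {"stringValue": "2020-08-01"}}],
)

for record in response["records"]:
    print(record)
```

Every statement goes over HTTPS and is authenticated via IAM and Secrets Manager, so there is no long-lived MySQL connection to manage at all - which is exactly why it only fits certain workload shapes.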

5. Decision Making

I think technically it should be no issue to access Aurora Serverless. You basically have four connectivity options: (a) directly over the internet, (b) over an AWS managed (site-to-site) VPN connection, (c) over an EC2 instance based VPN connection and (d) over Direct Connect (abbreviated DX).

  • Option (a) is only possible if you rearchitect your application to work with the Data API AFAIK.
  • Option (d) should be supported but has the highest fixed costs. It should be supported because AWS Interface Endpoints (the entry points into Aurora Serverless) are accessible via DX.
  • Option (c) should be supported according to experts here on SO. [19]
  • Option (b) was certainly not supported at the beginning - but as far as I understand, could be now. This is because AWS PrivateLink (the technology underpinning AWS Interface Endpoints) supports connections from on-premises via AWS managed VPN since September 2018. [17]

Additionally, you possibly have to forward DNS queries from on-premises into the cloud in order to resolve the VPC Interface Endpoints properly. [18]
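
As a rough sketch of the AWS side of that DNS forwarding (names, subnet and security-group IDs below are placeholders, and your on-premises DNS still needs a matching forwarding rule), creating the inbound Route 53 Resolver endpoint could look like this:

```python
# Sketch: create a Route 53 Resolver INBOUND endpoint so on-premises DNS
# servers can forward queries for the VPC endpoint hostnames into the VPC.
# Subnet and security-group IDs are placeholders.
import uuid
import boto3

resolver = boto3.client("route53resolver", region_name="eu-central-1")

endpoint = resolver.create_resolver_endpoint(
    CreatorRequestId=str(uuid.uuid4()),          # idempotency token
    Name="onprem-inbound",
    Direction="INBOUND",
    SecurityGroupIds=["sg-0123456789abcdef0"],   # must allow DNS (port 53) from on-premises
    IpAddresses=[
        {"SubnetId": "subnet-0aaa1111"},
        {"SubnetId": "subnet-0bbb2222"},
    ],
)

# Point your on-premises DNS forwarders at the IPs assigned to this endpoint.
print(endpoint["ResolverEndpoint"]["Id"])
```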

You should characterise your workload, specify the minimal requirements with regard to security, reliability, performance (see Well-Architected Framework) and finally look at the most cost-effective approach to accomplish it. In a B2B model, I would not compromise these three to achieve cost reduction (see my opinion in the section below).

You basically have two options:

  1. doing the work on your own (which is hopefully a bit easier with the material referenced in this post)
  2. asking AWS or an external company for help from an AWS Solutions Architect

This is purely a trade-off between (1) the time it takes to figure all this out and get it working, (2) the costs (i.e. operating costs for the implemented solution and costs for consultation), and (3) the financial risk involved if something goes wrong during the migration.

As you state in the question "moving everything into the cloud", I guess you are at the beginning of the cloud journey. The official AWS papers state the following for companies in that situation:

If your business is new to AWS, consider a managed service provider, such as AWS Managed Services, to build out and manage the platform. [14]

Having a background in the startup industry, I understand that this is often not an option for smaller companies - but I just wanted to mention that it exists.

6. Conclusion / My Opinion(!)

Exposing a database to the internet is a practice best avoided. That is not just my own opinion, but that of others here on SO too. [19]

I would try to go (as a bare minimum!) with the AWS managed VPN approach and set up a redundant VPN connection between on-premises and the cloud.

Why do I say "bare minimum"?
Because a proper solution would probably be to move the whole workload into the cloud. However, if this is not possible, I would try to reduce the risk involved in establishing a hybrid workload. A managed VPN connection is probably the most cost-effective way for small workloads to reduce the risk from a security perspective.

From my experience:
For the last three years, I have operated a SaaS application that was fully built in the AWS cloud. We have had several outages of our network carrier in that time. I would never trust them enough to establish some sort of hybrid architecture. Not for the type of workload we are offering (a SaaS web app in the B2B sector) and the internet contract/connectivity we have at the moment. Never. However, the situation might be completely different for you - especially if you have already been hosting services from your datacenter/office without reliability issues for a long time.

If you have read this far, you probably wonder why someone would ever want to write such an essay. Well, I am just preparing for the AWS Certified Database Specialty [20] and this is a good opportunity to do some serious research, take some notes and collect some sources/references. I want to endorse the various AWS certification paths [16] and the ecosystem of learning platforms around them. There is so much very informative material published by AWS.

Hopefully you found something interesting in this post for yourself too.

A. Aurora POC Guide

The guide mentions that when doing a database migration to Aurora, one should:

  • rewrite some parts of the client application code - especially to properly use the DNS endpoints [5][6] and the connection pooling [5]

  • do a schema conversion if migrating from a rather complex (proprietary) source DB engine ("Port Your SQL Code" [12])

  • (optionally) incorporate some Aurora-specific changes to make the migration more effective (applicable to a Rearchitect type of migration) [2] (a rough connection-logic sketch follows below the quoted guidelines):

    • To take full advantage of Aurora capabilities for distributed parallel execution, you might need to change the connection logic. Your objective is to avoid sending all read requests to the primary instance. The read-only Aurora Replicas are standing by, with all the same data, ready to handle SELECT statements. Code your application logic to use the appropriate endpoint for each kind of operation. Follow these general guidelines:
    • Avoid using a single hard-coded connection string for all database sessions.
    • If practical, enclose write operations such as DDL and DML statements in functions in your client application code. That way, you can make different kinds of operations use specific connections.
    • Make separate functions for query operations. Aurora assigns each new connection to the reader endpoint to a different Aurora Replica to balance the load for read-intensive applications.
    • For operations involving sets of queries, close and reopen the connection to the reader endpoint when each set of related queries is finished. Use connection pooling if that feature is available in your software stack. Directing queries to different connections helps Aurora to distribute the read workload among the DB instances in the cluster.
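
To make those guidelines concrete, here is a rough sketch of such connection logic in Python. It assumes PyMySQL as the driver and a provisioned Aurora cluster; the writer/reader endpoints and credentials are placeholders.

```python
# Rough sketch: route writes to the cluster (writer) endpoint and reads to
# the reader endpoint instead of using one hard-coded connection string.
# Hostnames and credentials are placeholders; PyMySQL is assumed as the driver.
import pymysql

WRITER_ENDPOINT = "mycluster.cluster-abc123.eu-central-1.rds.amazonaws.com"
READER_ENDPOINT = "mycluster.cluster-ro-abc123.eu-central-1.rds.amazonaws.com"


def open_connection(host):
    return pymysql.connect(
        host=host, user="app", password="secret", database="mydb",
        connect_timeout=5, autocommit=True,
    )


def run_write(sql, params=None):
    """DDL/DML goes to the writer endpoint."""
    conn = open_connection(WRITER_ENDPOINT)
    try:
        with conn.cursor() as cur:
            cur.execute(sql, params)
    finally:
        conn.close()


def run_read(sql, params=None):
    """SELECTs go to the reader endpoint; each new connection may land on a
    different Aurora Replica, which spreads the read load."""
    conn = open_connection(READER_ENDPOINT)
    try:
        with conn.cursor() as cur:
            cur.execute(sql, params)
            return cur.fetchall()
    finally:
        conn.close()
```

In a real application you would keep these connections in a pool rather than opening one per statement, but the split between writer and reader endpoints is the important part.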

References

[1] https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless.html#aurora-serverless.limitations
[2] https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-poc.html#Aurora.PoC.Connections
[3] https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-poc.html#Aurora.PoC.Measurement
[4] https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless.html#aurora-serverless.limitations
[5] https://d1.awsstatic.com/whitepapers/RDS/amazon-aurora-mysql-database-administrator-handbook.pdf
[6] https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Aurora.Connecting.html
[7] https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/data-api.html
[8] https://www.youtube.com/watch?v=I0uHo4xAIxg#t=12m30s
[9] https://d1.awsstatic.com/whitepapers/architecture/AWS-Reliability-Pillar.pdf
[10] https://aws.amazon.com/architecture/well-architected/
[11] https://aws.amazon.com/de/well-architected-tool/
[12] https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-poc.html
[13] https://aws.amazon.com/blogs/architecture/well-architected-lens-focus-on-specific-workload-types/
[14] https://d1.awsstatic.com/whitepapers/Migration/aws-migration-whitepaper.pdf
[15] https://docs.aws.amazon.com/prescriptive-guidance/latest/database-migration-strategy/database-migration-strategy.pdf
[16] https://aws.amazon.com/training/learning-paths/
[17] https://aws.amazon.com/about-aws/whats-new/2018/09/aws-privatelink-now-supports-access-over-aws-vpn/
[18] https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/resolver-forwarding-inbound-queries.html
[19] https://stackoverflow.com/a/52842424/10473469
[20] https://aws.amazon.com/de/certification/certified-database-specialty/

Martin Löper answered Oct 06 '22



You are correct, you can't directly connect to Aurora Serverless (AS) from outside AWS. The reason is that AS cannot be made public. From the docs:

You can't give an Aurora Serverless DB cluster a public IP address. You can access an Aurora Serverless DB cluster only from within a virtual private cloud (VPC) based on the Amazon VPC service.

AS also has many other limitations that you should be aware of; among them: no read replicas and no IAM authentication.

Q1 Connection to AS

There are several options typically used to connect to AS, or to other services not accessible from the internet (e.g. RDS Proxy, an Elasticsearch domain).

Bastion/jump host

The cheapest, most ad hoc option, employed mostly for testing and development, is a bastion/jump host. With this option you would set up SSH tunnels to the bastion, which in turn would connect you to AS.

However, this is obviously not suitable for reliable production access, but I feel it should at least be mentioned in the answer.
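
For completeness, here is a sketch of what such an ad hoc tunnel could look like from Python; the bastion address, key path, DB endpoint and credentials are placeholders, and the sshtunnel and PyMySQL packages are assumed to be installed.

```python
# Sketch: ad hoc access to a private database through a bastion host in the VPC.
# Bastion address, key path, DB endpoint and credentials are placeholders.
import pymysql
from sshtunnel import SSHTunnelForwarder

with SSHTunnelForwarder(
    ("bastion.example.com", 22),
    ssh_username="ec2-user",
    ssh_pkey="~/.ssh/bastion.pem",
    remote_bind_address=("mycluster.cluster-abc123.eu-central-1.rds.amazonaws.com", 3306),
) as tunnel:
    conn = pymysql.connect(
        host="127.0.0.1", port=tunnel.local_bind_port,
        user="app", password="secret", database="mydb",
    )
    with conn.cursor() as cur:
        cur.execute("SELECT NOW()")
        print(cur.fetchone())
    conn.close()
```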

AWS Site-to-Site VPN

AWS Site-to-Site VPN is another option, as you already mentioned. This is obviously a better way of enabling access from on-prem to the VPC.

But the concern is the cost, as you are charged $0.05 per connection-hour plus data transfer.

The price per hour is not that much. For one month it is about $36:

24 hours x 30 days x $0.05 = $36

Data transfer is more difficult to estimate, as it depends on your actual requirements. For example, if you estimate that you will be getting 100 GB of data out of AS a month (inbound traffic is free), you will pay about $8.91 per month (the first 1 GB is free):

99GB * $0.09 = $8.91

Assuming the above scenario, you will be paying about $44.91/month. This does not include the price of AS itself.
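
If you want to rerun this estimate with your own numbers, a throwaway snippet like the one below does the same arithmetic; the rates are simply the ones quoted above and should be verified against current pricing for your region.

```python
# Throwaway estimate of AWS Site-to-Site VPN monthly cost.
# Rates are the ones quoted above; verify current pricing for your region.
HOURS_PER_MONTH = 24 * 30
VPN_HOUR_RATE = 0.05   # USD per connection-hour
EGRESS_RATE = 0.09     # USD per GB transferred out (after the first free GB)


def vpn_monthly_cost(egress_gb):
    connection = HOURS_PER_MONTH * VPN_HOUR_RATE    # $36.00
    transfer = max(egress_gb - 1, 0) * EGRESS_RATE  # $8.91 for 100 GB out
    return connection + transfer


print(vpn_monthly_cost(100))  # ~44.91
```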

However, due to the mentioned issues with the firewall setup, this may be more trouble to set up and manage than it is worth.

Direct Connect

AWS Direct Connect is the most expensive option, but the most reliable and private. I just wanted to mention it, as it is probably not suited to your use case.

Q2 Suitability of AS

One of the use cases of AS is "infrequently used applications":

You have an application that is only used for a few minutes several times per day or week, such as a low-volume blog site. With Aurora Serverless, you pay for only the database resources that you consume on a per-second basis.

Also, you need to take into account AS cold starts, which may be problematic, as reported here or here for example.

It's not clear from your question exactly what the usage pattern for AS would be, or whether cold starts would be problematic. But based on the stated issues with the lack of public access to AS and the difficulties in setting up a VPN due to the firewall, I would lean towards using regular Aurora MySQL or RDS (I can't really comment on DigitalOcean).

The reasons are that you can have public access to it, it's very fast to set up, pricing is known, there are no cold start issues, and it's a managed service. Also, it supports storage autoscaling, so you won't need to worry about that.

What's more, you can start with a small DB instance (t3.small or smaller) and then up-size when needed, or add read replicas to off-load read-intensive workloads.
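
As a rough sketch of how that later resize or replica addition could be scripted (boto3 assumed; instance identifiers and classes are placeholders, and this shows the RDS MySQL variant rather than Aurora, where replicas are added as extra cluster instances):

```python
# Sketch: up-size an RDS instance and add a read replica later (RDS MySQL).
# Instance identifiers and target classes are placeholders.
import boto3

rds = boto3.client("rds", region_name="eu-central-1")

# Resize the existing instance; causes a short interruption unless Multi-AZ.
rds.modify_db_instance(
    DBInstanceIdentifier="mydb",
    DBInstanceClass="db.t3.medium",
    ApplyImmediately=True,
)

# Add a read replica to off-load read-intensive periods.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="mydb-replica-1",
    SourceDBInstanceIdentifier="mydb",
    DBInstanceClass="db.t3.small",
)
```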

Example costs would be:

  • Aurora MySQL, t3.small and 100 GB of initial storage: $39.93 (details here).

  • RDS MySQL, t3.small and 100 GB: $36.32 (details here).

The above does not include any read replicas, Multi-AZ setup or other extra features provided by RDS or Aurora. You can use calculator.aws to perform your own estimations based on your individual needs. For RDS you can use an even smaller instance than t3.small, e.g. t2.micro.

At the same time, exposing your production database over the internet is generally not recommended, so you may end up keeping it private anyway and using a VPN to access it. But with properly configured security groups and network ACLs you could limit public access to the IP range of individual workstations or your workplace. This would reduce the risk of having a public IP for the database if a VPN is not really an option.
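
A sketch of that last point, restricting the database's security group to a known office range (the group ID and CIDR below are placeholders):

```python
# Sketch: limit inbound MySQL access on a publicly accessible RDS instance
# to a known office IP range. Group ID and CIDR are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="eu-central-1")

ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 3306,
        "ToPort": 3306,
        "IpRanges": [{"CidrIp": "203.0.113.0/24", "Description": "office egress range"}],
    }],
)
```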

P.S.

I would recommend independently verifying the prices and details provided, as mistakes are possible.

Marcin answered Oct 06 '22
