 

Multi-tenancy: Individual database per tenant

We are developing a multi-tenant application. Architecturally, we have designed a shared middle tier for business logic and one database per tenant for data persistence. That means the business tier establishes a set of connections (a connection pool) with the database server for each tenant, so the application maintains a separate connection pool per tenant. If we expect around 5,000 tenants, this approach demands very high resource utilization (connections between the app server and the database server for every tenant), which leads to performance problems.

We resolved that by keeping a common connection pool. To maintain a single connection pool across the different databases, we created a new database called 'App-master'. We now always connect to the 'App-master' database first and then change the database to the tenant-specific one. That solved our connection-pool issue.
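Roughly, what we do looks like this (a simplified C# sketch with ADO.NET; the connection string and the GetTenantDatabaseName helper are illustrative, not our real code):

```csharp
using System.Data.SqlClient;

public static class TenantConnections
{
    // One connection string for every tenant, so ADO.NET keeps a single pool.
    private const string AppMasterConnectionString =
        "Server=db-server;Database=App-master;Integrated Security=true;";

    public static SqlConnection OpenForTenant(string tenantId)
    {
        var connection = new SqlConnection(AppMasterConnectionString);
        connection.Open(); // drawn from the shared App-master pool

        // Switch the open session over to the tenant's database.
        // This is the step that fails on Azure SQL.
        connection.ChangeDatabase(GetTenantDatabaseName(tenantId));
        return connection;
    }

    // Illustrative lookup; the real mapping lives in a tenant registry.
    private static string GetTenantDatabaseName(string tenantId)
        => "Tenant_" + tenantId;
}
```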

This solution works perfectly with an on-premise database server, but it does not work with Azure SQL, which does not support changing the database on an open connection (the USE statement).
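For reference, the obvious Azure-compatible alternative, baking the tenant database into the connection string, just brings back the per-tenant pools, because ADO.NET pools per distinct connection string (sketch only; server name and credentials are placeholders):

```csharp
using System.Data.SqlClient;

public static class AzureTenantConnections
{
    public static SqlConnection OpenForTenant(string tenantId)
    {
        var builder = new SqlConnectionStringBuilder
        {
            DataSource = "myserver.database.windows.net", // placeholder server
            InitialCatalog = "Tenant_" + tenantId,        // tenant database baked in
            UserID = "appUser",                           // placeholder credentials
            Password = "...",
        };

        // Works on Azure SQL, but each distinct connection string gets its
        // own ADO.NET pool, so 5,000 tenants means 5,000 pools again.
        var connection = new SqlConnection(builder.ConnectionString);
        connection.Open();
        return connection;
    }
}
```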

Thanks in advance for any suggestions on how to maintain the connection pool, or for a better approach / best practice for dealing with such a multi-tenant scenario.

asked Aug 18 '14 by Hitesh


1 Answer

I have seen this problem before with multi-tenancy schemes that use separate databases. There are two overlapping problems: the number of web servers per tenant, and the total number of tenants. The first is the bigger issue: if you are caching database connections via ADO.net connection pooling, the likelihood that a specific customer's request arrives at a web server holding an open connection to their database is inversely proportional to the number of web servers you have. The more you scale out, the more any given customer will notice a per-call (not just initial login) delay as the web server makes the initial connection to the database on their behalf. Each call made to a non-sticky, highly scaled web server tier is decreasingly likely to find an existing open database connection that can be reused.
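To make the proportionality concrete, a toy illustration (assuming uniform, non-sticky load balancing; the server counts are made up):

```csharp
// With non-sticky load balancing, a request lands on any of N web servers
// with equal probability, so the chance of reusing a connection the tenant
// warmed up earlier on one particular server is roughly 1/N.
foreach (var n in new[] { 2, 10, 50 })
{
    System.Console.WriteLine($"{n} web servers -> ~{100.0 / n:F0}% warm-pool hit rate");
}
```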

The second problem is simply one of having so many connections in your pools, and the likelihood of this creating memory pressure or poor performance.

You can "solve" the first problem by establishing a limited number of database application servers (simple WCF endpoints) that carry out database communication on behalf of your web servers. Each WCF database application server serves a known pool of customer connections (the Eastern Region goes to Server A, the Western Region to Server B), which means a very high likelihood of a connection-pool hit for any given request. This also allows you to scale access to the database separately from access to the HTML-rendering web servers (the database is your most critical performance bottleneck, so this might not be a bad thing).
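A rough sketch of that shape, assuming WCF over net.tcp; the contract, the operation, and the endpoint addresses are all illustrative:

```csharp
using System.ServiceModel;

// The "database application server" contract: web servers call a small set
// of WCF endpoints, each owning the warm connection pools for one tenant group.
[ServiceContract]
public interface ITenantDataService
{
    [OperationContract]
    string GetCustomerName(string tenantId, int customerId); // example operation
}

// Web-server side: pick the endpoint by tenant group so the same WCF host
// (and therefore the same warm connection pool) always serves a given tenant.
public static class TenantDataRouter
{
    public static ITenantDataService GetChannel(string region)
    {
        // Hypothetical addresses for the regional database application servers.
        var address = region == "East"
            ? "net.tcp://dbapp-east:9000/TenantDataService"
            : "net.tcp://dbapp-west:9000/TenantDataService";

        var factory = new ChannelFactory<ITenantDataService>(
            new NetTcpBinding(), new EndpointAddress(address));
        return factory.CreateChannel();
    }
}
```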

A second solution is to use content-specific routing via an NLB router. These route traffic based on content and allow you to segment your web server tier by customer grouping (Western Region, Eastern Region, etc.); each set of web servers therefore has a much smaller number of active connections, with a corresponding increase in the likelihood of getting an open and unused connection.
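If the router keys on something like a tenant ID in a URL segment or header, the grouping itself can be a simple stable mapping (illustrative sketch, not any particular router's API):

```csharp
// Deterministic tenant-to-group mapping that a content-aware router could
// apply, so a given tenant's requests always land in the same web server
// group and that group's connection pools stay small and warm.
public static int GetServerGroup(string tenantId, int groupCount)
{
    int hash = 23;
    foreach (var c in tenantId)
        hash = unchecked(hash * 31 + c); // simple stable string hash
    return (hash & 0x7FFFFFFF) % groupCount; // non-negative group index
}
```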

Both of these problems are caching issues in general: the more you scale out as a completely "unsticky" architecture, the less likely any call is to hit cached data, whether that is a cached database connection or read-cached data. Managing user connections to maximize the likelihood of a cache hit is useful for maintaining high performance.

answered Oct 12 '22 by PhillipH