Scenario
Say you have a website or app that has tons of traffic. And even with a database connection pool, performance is taking a real hit (the site/app may even be crashing) because there are too many concurrent connections.
Question
What are someone's options for dealing with this problem?
My thoughts
I was thinking someone with this problem could create multiple databases (possibly on different machines although I'm not sure that's necessary), each with the same information and updated at the same time, which would grant a multiple of the original number of connections for a single database. But if the database is large that doesn't seem like a very viable solution.
What happens if connection pool is full?
If the maximum pool size has been reached and no usable connection is available, the request is queued. The pooler then tries to reclaim any connections until the time-out is reached (the default is 15 seconds). If the pooler cannot satisfy the request before the connection times out, an exception is thrown.
What are some of the main issues with using connection pools?
One of the most common issues undermining connection pool benefits is the fact that pooled connections can end up being stale. This most often happens due to inactive connections being timed out by network devices between the JVM and the database. As a result, there will be stale connections in the pool.
The stem is not specific enough to give a firm suggestion, but the complete list of what could be done is as follow:
-
Database cluster: Suitable for situations where you don't want to change your application layer and database is all you touch. There's a limit on how much you can get out of a database cluster. If your request volume keeps on growing, this solution will fail as well eventually. But the good news is that you've got all the functionality you've already had in an ordinary single-instance MySQL.
-
Sharding: Since your question is tagged with MySQL, and it does not support sharding on its own, if you want to use this solution you need to implement it in your application layer. In this solution you'll scatter your data over multiple databases (preferably in multiple MySQL instances on separate hardware) logically. It will be your responsibility to find the appropriate database holding your designated data. It's one of the most effective solutions ever but it's not always feasible. Its biggest flaw is that data scattered among two or more databases can not be included within a transaction.
-
Replication: Depending on your scenario you might be able to incorporate database replication and have copies of your data on them. This way you can connect to them instead of the master database and reduce the load on it. The default replication definition is master/slave scenario in which data flow is one way, from master to the slave. So changes you might make on the slave while will be applied on the salve, they won't be affecting the master. But there is also a master/master replication configuration in which data flow is in both ways. Yet you can not assume atomic integrity for concurrent data changes among both masters. In the end this solution is most effective if you plan to use it in master/slave mode and using slaves for read-only access.
-
Caching: Perhaps this solution should not be included here but since your stem does not reject it, here it goes. One of the ways to reduce database load is to cache its data once extracted. This solution can be beneficial specially if extracting data is expensive. There are many cache servers out there, like memcached or redis. This way you can omit so many of the database connections but only for extraction of data.
-
Other storage engines: You can always switch to more performant engines if your current one does not provide you with what you need. Of course this is only feasible if your needs allow you to. Nowadays there are NoSQL engines, much more performant than RDBMS, which support sharding natively and you can scale them linearly with minimum effort. There are also Lucene based solutions out there with powerful full-text search capabilities providing you with the same automatic sharding. In fact the only reason why you should be using a traditional RDBMS is the atomic behavior of transactions. But if transactions are not a must, there are much better solutions than RDBMS.
If you don't already, you could try running your application on an application server -- to get some middleware behind your app. Most application servers will do their own connection pooling (because getting a connection from a web app to a database connection pool is still really really expensive). Additionally, you should be able to configure your application server to use shared connections -- which as the name implies will allow connections to be shared wherever possible.
In short, use an appserver. If you already are, maybe mention which one you're using and we can look at optimizing the server config from there.