Number of DB Connections vs Java Threads

I am currently developing a Java application that compares table data across two different databases.

I am using connection pooling and a thread pool (ExecutorService). I made the number of connections and the number of threads configurable, and I am now trying to find the optimal value for each.

I know that the best way to find the optimal numbers is to try different values, but my question is: what factors should I consider, or how can I calculate the number of connections and threads required?

There are typically 3000 tables to compare. The list of tables and their schemas is available up front, and for the time being assume that each table holds a few hundred records (so I don't need to query a table more than once).

Currently, my application assigns one thread (from the thread pool) to each table. That thread opens two connections, one to each database (sequentially for now), and once the data is retrieved, the same thread calls a method that compares the data.

Here are a few questions I have. Say N is the number of cores and M is the maximum number of DB connections the databases can accept:

  1. If I have more threads than N, will that be useful for my use case? If yes, how?
  2. What is the limiting factor here: the number of cores or the number of connections?
  3. Is having more threads than M of any use?
Vikash Talanki asked Oct 26 '25 06:10



1 Answer

  1. Yes, spawning many more threads than cores will help, because at any given time some of the threads will be blocked on I/O, and while they wait, other threads can use the cores to do processing.

  2. From the above it follows that the limiting factor is certainly not the number of cores. However, the number of connections may not be the limiting factor either. Of course you cannot exceed the maximum number of connections, but you might find that you cannot even reach that limit, because disk throughput (on the database server side) or network congestion might become a bottleneck first.

  3. Having more threads than the maximum number of connections might yield a small benefit, provided that each thread a) obtains a connection from the connection pool, b) reads all the data in, c) releases the connection back to the pool, and only THEN d) compares the data. That way, while one thread is comparing data, another thread can use that connection to read its own data. However, comparing data sounds like a fairly simple and quick job, so the benefit will not be that great: a thread will finish comparing fairly quickly, after which it will want to obtain another connection from the pool, and it will block if all connections are in use.
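
To make point 3 concrete, here is a minimal sketch of steps a) through d), not a definitive implementation. Everything in it is an assumption made for illustration: the class `TableCompareSketch` and its methods are hypothetical names, a `Semaphore` with M permits stands in for a real connection pool, `Thread.sleep` stands in for the query, and the fake rows are identical in both databases so every table "matches". The part that matters is that the permit (the connection) is released before the comparison runs, so a thread pool larger than M can keep the connections busy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

final class TableCompareSketch {

    // Hypothetical stand-in for a connection pool with M slots. A real pool
    // would block inside its own "get connection" call instead.
    private final Semaphore connections;
    private final AtomicInteger inUse = new AtomicInteger();
    private final AtomicInteger peakInUse = new AtomicInteger();

    TableCompareSketch(int maxConnections) {
        this.connections = new Semaphore(maxConnections);
    }

    // Steps a) to c): hold a "connection" only for the duration of the read.
    private List<String> readTable(String db, String table) throws InterruptedException {
        connections.acquire();                       // a) obtain a connection
        try {
            int now = inUse.incrementAndGet();
            peakInUse.accumulateAndGet(now, Math::max);
            Thread.sleep(5);                         // b) simulated query against db
            // Fake data: the same rows regardless of which database is read.
            return List.of(table + "-row1", table + "-row2");
        } finally {
            inUse.decrementAndGet();
            connections.release();                   // c) give the connection back
        }
    }

    private boolean compareTable(String table) throws InterruptedException {
        List<String> left = readTable("dbA", table);
        List<String> right = readTable("dbB", table);
        return left.equals(right);                   // d) compare with no connection held
    }

    // Highest number of simulated connections observed in use at once.
    int peakConnectionsInUse() {
        return peakInUse.get();
    }

    // Runs one task per table on a fixed pool and returns how many tables matched.
    int compareAll(int tableCount, int threadCount) {
        ExecutorService executor = Executors.newFixedThreadPool(threadCount);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < tableCount; i++) {
                String table = "table_" + i;
                results.add(executor.submit(() -> compareTable(table)));
            }
            int matching = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    matching++;
                }
            }
            return matching;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a comparison task failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        // 16 threads share 4 simulated connections (threads > M).
        TableCompareSketch sketch = new TableCompareSketch(4);
        System.out.println("matching tables: " + sketch.compareAll(40, 16));
        System.out.println("peak connections in use: " + sketch.peakConnectionsInUse());
    }
}
```

In this sketch the peak number of connections in use can never exceed the permit count, however many threads are configured. Timing runs with different thread counts against a fixed permit count is one way to see how little extra threads buy once the comparison step is cheap.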

That having been said, I hope you are aware of the fact that there exist tools out there, even free tools, that will do these kinds of comparisons for you. Search for "SQL compare". (I know, it is a misnomer, the tools do not compare SQL, they compare databases, and they happen to use SQL to query the databases that they compare; I did not come up with the name, the creators of these tools did.)

Mike Nakis answered Oct 28 '25 21:10



