Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Concurrent use of same JDBC connection by multiple threads

I'm trying to better understand what will happen if multiple threads try to execute different sql queries, using the same JDBC connection, concurrently.

  • Will the outcome be functionally correct?

  • What are the performance implications?

  • Will thread A have to wait for thread B to be completely done with its query?

  • Or will thread A be able to send its query immediately after thread B has sent its query, after which the database will execute both queries in parallel?


I see that the Apache DBCP uses synchronization protocols to ensure that connections obtained from the pool are removed from the pool, and made unavailable, until they are closed. This seems more inconvenient than it needs to be. I'm thinking of building my own "pool" simply by creating a static list of open connections, and distributing them in a round-robin manner.

I don't mind the occasional performance degradation, and the convenience of not having to close the connection after every use seems very appealing. Is there any downside to me doing this?

like image 693
RvPr Avatar asked Oct 03 '15 20:10

RvPr


People also ask

Can multiple threads use the same DB connection?

No. Of course not. Each thread needs its own connection. "Allocation of one database connection per thread will cause a large overhead" probably a lot less overhead than you think, and not as much troubles as trying to share a connection concurrently.

What will happen if multiple threads accessing the same resource?

Multiple threads accessing shared data simultaneously may lead to a timing dependent error known as data race condition. Data races may be hidden in the code without interfering or harming the program execution until the moment when threads are scheduled in a scenario (the condition) that break the program execution.

Are JDBC connections thread-safe?

The PostgreSQL™ JDBC driver is thread safe. Consequently, if your application uses multiple threads then you do not have to worry about complex algorithms to ensure that only one thread uses the database at a time.

Is Oracle JDBC connection thread-safe?

Oracle states the thread safety of its Connection implementation in its official document: The Oracle JDBC drivers provide full support for, and are highly optimized for, applications that use Java multithreading … However, Oracle strongly discourages sharing a database connection among multiple threads.

Can two threads use the same JDBC connection object?

If you choose to share the connection, then the same JDBC connection object will be used by all threads (each thread will have its own statement object, however). Because all Oracle JDBC API methods are synchronized, if two threads try to use the connection object simultaneously, then one will be forced to wait until the other one finishes its use.

Do the Oracle JDBC drivers support multithreading?

The Oracle JDBC drivers provide full support for programs that use Java multithreading. The following example creates a specified number of threads and lets you determine whether or not the threads will share a connection.

What is the use of JDBC in Java?

JDBC allows you to share a single Connection among multiple threads. Working with multiple threads sharing a single connection Working with multiple threads sharing a single connection JDBC allows you to share a single Connectionamong multiple threads.

Why am I sharing a connection among multiple threads?

You may be sharing a Connection among multiple threads because you have experienced poor concurrency using separate transactions. Example of threads sharing a statement This example shows what can happen if two threads try to share a single Statement. Parent topic:Controlling Derby application behavior Related concepts


1 Answers

I ran the following set of tests using a AWS RDS Postgres database, and Java 11:

  1. Create a table with 11M rows, each row containing a single TEXT column, populated with a random 100-char string

  2. Pick a random 5 character string, and search for partial-matches of this string, in the above table

  3. Time how long the above query takes to return results. In my case, it takes ~23 seconds. Because there are very few results returned, we can conclude that the majority of this 23 seconds is spent waiting for the DB to run the full-table-scan, and not in sending the request/response packets

  4. Run multiple queries in parallel (with different keywords), using different connections. In my case, I see that they all complete in ~23 seconds. Ie, the queries are being efficiently parallelized

  5. Run multiple queries on parallel threads, using the same connection. I now see that the first result comes back in ~23 seconds. The second result comes back in ~46 seconds. The third in ~1 minute. etc etc. All the results are functionally correct, in that they match the specific keyword queried by that thread

To add on to what Joni mentioned earlier, his conclusion matches the behavior I'm seeing on Postgres as well. It appears that all "correctness" is preserved, but all parallelism benefits are lost, if multiple queries are sent on the same connection at the same time.

like image 54
RvPr Avatar answered Sep 22 '22 12:09

RvPr