 

Efficient way to do batch INSERTS with JDBC

In my app I need to do a lot of INSERTs. It's a Java app and I am using plain JDBC to execute the queries. The database is Oracle. I have enabled batching, which saves the network latency of sending each query separately, but the queries still execute serially as separate INSERTs:

    insert into some_table (col1, col2) values (val1, val2)
    insert into some_table (col1, col2) values (val3, val4)
    insert into some_table (col1, col2) values (val5, val6)

I was wondering if the following form of INSERT might be more efficient:

    insert into some_table (col1, col2) values (val1, val2), (val3, val4), (val5, val6)

i.e. collapsing multiple INSERTs into one.
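
To make the collapsed form concrete, here is a minimal sketch that generates it with `?` placeholders for use with a PreparedStatement. The class name `MultiRowInsertSql` and the method `build` are made up for this illustration. Not every database or version accepts a multi-row VALUES list, so check the documentation for yours before relying on it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of JDBC: builds a multi-row INSERT with placeholders.
class MultiRowInsertSql {

    /** Returns e.g. "insert into t (a, b) values (?, ?), (?, ?)" for rowCount = 2. */
    static String build(String table, List<String> columns, int rowCount) {
        if (columns.isEmpty() || rowCount < 1) {
            throw new IllegalArgumentException("need at least one column and one row");
        }
        // One "(?, ?, ...)" group with a placeholder per column
        String oneRow = "(" + String.join(", ", Collections.nCopies(columns.size(), "?")) + ")";
        // Repeat the group once per row, comma separated
        String allRows = String.join(", ", Collections.nCopies(rowCount, oneRow));
        return "insert into " + table + " (" + String.join(", ", columns) + ") values " + allRows;
    }

    public static void main(String[] args) {
        System.out.println(build("some_table", Arrays.asList("col1", "col2"), 3));
    }
}
```

The parameters would then be bound in row order: indexes 1 and 2 for the first row, 3 and 4 for the second, and so on.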

Any other tips for making batch INSERTs faster?

asked Sep 24 '10 04:09 by Aayush Puri




2 Answers

This is a mix of the two previous answers:

    PreparedStatement ps = c.prepareStatement("INSERT INTO employees VALUES (?, ?)");

    ps.setString(1, "John");
    ps.setString(2, "Doe");
    ps.addBatch();

    ps.clearParameters();
    ps.setString(1, "Dave");
    ps.setString(2, "Smith");
    ps.addBatch();

    ps.clearParameters();
    int[] results = ps.executeBatch();
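
The `int[]` returned by `executeBatch()` holds one entry per batched statement. A minimal sketch of how it could be read follows; `BatchResults` and its methods are names invented for this example. An entry is either an update count (zero or more) or `Statement.SUCCESS_NO_INFO`. If a statement fails, `executeBatch()` throws `BatchUpdateException`, and the array from its `getUpdateCounts()` may contain `Statement.EXECUTE_FAILED` when the driver keeps processing after a failure.

```java
import java.sql.Statement;

// Hypothetical helper for summarising the array returned by executeBatch()
// or by BatchUpdateException.getUpdateCounts().
class BatchResults {

    /** Counts entries that report success, with or without a known row count. */
    static int countSucceeded(int[] results) {
        int ok = 0;
        for (int r : results) {
            // r >= 0 is an update count; SUCCESS_NO_INFO means success, row count unknown
            if (r >= 0 || r == Statement.SUCCESS_NO_INFO) {
                ok++;
            }
        }
        return ok;
    }

    /** Counts entries explicitly marked as failed. */
    static int countFailed(int[] results) {
        int failed = 0;
        for (int r : results) {
            if (r == Statement.EXECUTE_FAILED) {
                failed++;
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        int[] sample = {1, Statement.SUCCESS_NO_INFO, Statement.EXECUTE_FAILED};
        System.out.println("succeeded = " + countSucceeded(sample));
        System.out.println("failed = " + countFailed(sample));
    }
}
```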
answered Oct 13 '22 05:10 by Tusc


Though the question asks about inserting efficiently into Oracle using JDBC, I'm currently working with DB2 (on an IBM mainframe). Conceptually the inserts are similar, so I thought it might be helpful to share my metrics comparing:

  • inserting one record at a time

  • inserting a batch of records (very efficient)

Here are the metrics.

1) Inserting one record at a time

    // imports: java.sql.Connection, java.sql.PreparedStatement, java.sql.SQLException
    // getDatabaseConnection() is my own helper that opens the JDBC connection (not shown).
    public void writeWithCompileQuery(int records) {
        PreparedStatement statement;

        try {
            Connection connection = getDatabaseConnection();
            connection.setAutoCommit(true);

            String compiledQuery = "INSERT INTO TESTDB.EMPLOYEE(EMPNO, EMPNM, DEPT, RANK, USERNAME)" +
                    " VALUES" + "(?, ?, ?, ?, ?)";
            statement = connection.prepareStatement(compiledQuery);

            long start = System.currentTimeMillis();

            // <= so that exactly `records` rows are inserted, matching the divisor of the average below
            for (int index = 1; index <= records; index++) {
                statement.setInt(1, index);
                statement.setString(2, "emp number-" + index);
                statement.setInt(3, index);
                statement.setInt(4, index);
                statement.setString(5, "username");

                // One executeUpdate() call, and so one round trip, per row
                long startInternal = System.currentTimeMillis();
                statement.executeUpdate();
                System.out.println("each transaction time taken = " + (System.currentTimeMillis() - startInternal) + " ms");
            }

            long end = System.currentTimeMillis();
            System.out.println("total time taken = " + (end - start) + " ms");
            System.out.println("avg total time taken = " + (end - start) / records + " ms");

            statement.close();
            connection.close();

        } catch (SQLException ex) {
            System.err.println("SQLException information");
            // Walk the chain of SQLExceptions attached to the first one
            while (ex != null) {
                System.err.println("Error msg: " + ex.getMessage());
                ex = ex.getNextException();
            }
        }
    }

The metrics for 100 transactions:

    each transaction time taken = 123 ms
    each transaction time taken = 53 ms
    each transaction time taken = 48 ms
    each transaction time taken = 48 ms
    each transaction time taken = 49 ms
    each transaction time taken = 49 ms
    ...
    each transaction time taken = 49 ms
    each transaction time taken = 49 ms
    total time taken = 4935 ms
    avg total time taken = 49 ms

The first transaction takes around 120-150 ms, which covers parsing the query and then executing it; the subsequent transactions take only around 50 ms each. (That is still high, but my database is on a different server and I need to troubleshoot the network.)

2) Inserting in a batch (the efficient one), achieved with preparedStatement.executeBatch()

    // imports: java.sql.Connection, java.sql.PreparedStatement, java.sql.SQLException
    // getDatabaseConnection() is my own helper that opens the JDBC connection (not shown).
    public int[] writeInABatchWithCompiledQuery(int records) {
        PreparedStatement preparedStatement;

        try {
            Connection connection = getDatabaseConnection();
            connection.setAutoCommit(true);

            String compiledQuery = "INSERT INTO TESTDB.EMPLOYEE(EMPNO, EMPNM, DEPT, RANK, USERNAME)" +
                    " VALUES" + "(?, ?, ?, ?, ?)";
            preparedStatement = connection.prepareStatement(compiledQuery);

            // Add every row to the batch; the batch is submitted by executeBatch() below
            for (int index = 1; index <= records; index++) {
                preparedStatement.setInt(1, index);
                preparedStatement.setString(2, "empo number-" + index);
                preparedStatement.setInt(3, index + 100);
                preparedStatement.setInt(4, index + 200);
                preparedStatement.setString(5, "usernames");
                preparedStatement.addBatch();
            }

            long start = System.currentTimeMillis();
            int[] inserted = preparedStatement.executeBatch();
            long end = System.currentTimeMillis();

            System.out.println("total time taken to insert the batch = " + (end - start) + " ms");
            // Average per record, in milliseconds
            System.out.println("avg time taken per record = " + (end - start) / (double) records + " ms");

            preparedStatement.close();
            connection.close();

            return inserted;

        } catch (SQLException ex) {
            System.err.println("SQLException information");
            // Walk the chain without losing the original exception
            for (SQLException e = ex; e != null; e = e.getNextException()) {
                System.err.println("Error msg: " + e.getMessage());
            }
            throw new RuntimeException("Error", ex);
        }
    }

The metrics for a batch of 100 transactions are:

total time taken to insert the batch = 127 ms 

and for 1000 transactions:

total time taken to insert the batch = 341 ms 

So inserting 100 records drops from ~5000 ms (one transaction at a time) to ~130 ms (a single batch of 100 records).
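
The per-record cost behind that comparison can be worked out from the totals printed above (4935 ms, 127 ms and 341 ms). The sketch below only does that arithmetic; `BatchMetrics` and its method names are invented for the example. It uses floating-point division, because integer division would truncate 127 / 100 to 1.

```java
// Hypothetical helper: per-record cost and speed-up from the measured totals.
class BatchMetrics {

    /** Average milliseconds per row, using floating-point division. */
    static double perRowMillis(long totalMillis, int rows) {
        return totalMillis / (double) rows;
    }

    /** How many times faster the batched run was than the baseline. */
    static double speedup(long baselineMillis, long batchedMillis) {
        return baselineMillis / (double) batchedMillis;
    }

    public static void main(String[] args) {
        System.out.println("one at a time, 100 rows: " + perRowMillis(4935, 100) + " ms/row");
        System.out.println("batch of 100:            " + perRowMillis(127, 100) + " ms/row");
        System.out.println("batch of 1000:           " + perRowMillis(341, 1000) + " ms/row");
        System.out.println("speed-up for 100 rows:   " + speedup(4935, 127));
    }
}
```

For 100 rows this comes to roughly 49 ms per row unbatched against a little over 1 ms per row batched, a speed-up of about 39 times on this setup.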

NOTE - My network is very slow, so ignore the absolute numbers; it is the relative difference between the two approaches that matters.
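
For much larger record counts, one option is to submit the batch in fixed-size chunks instead of accumulating every row before a single `executeBatch()` call. The following is a minimal sketch under that assumption, not something I measured above. `ChunkedBatchInsert`, `insertInChunks` and `chunkSize` are names made up for the example, the connection is supplied by the caller, and transaction handling is left to the caller as well.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical variant of the batch method: flushes every chunkSize rows.
class ChunkedBatchInsert {

    static final String SQL =
            "INSERT INTO TESTDB.EMPLOYEE(EMPNO, EMPNM, DEPT, RANK, USERNAME) VALUES (?, ?, ?, ?, ?)";

    /** Inserts rows 1..records and returns how many times executeBatch() was called. */
    static int insertInChunks(Connection connection, int records, int chunkSize) throws SQLException {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        int flushes = 0;
        // try-with-resources closes the statement even if a batch fails
        try (PreparedStatement ps = connection.prepareStatement(SQL)) {
            int pending = 0;
            for (int index = 1; index <= records; index++) {
                ps.setInt(1, index);
                ps.setString(2, "emp number-" + index);
                ps.setInt(3, index + 100);
                ps.setInt(4, index + 200);
                ps.setString(5, "username");
                ps.addBatch();
                // Submit a full chunk and start collecting the next one
                if (++pending == chunkSize) {
                    ps.executeBatch();
                    flushes++;
                    pending = 0;
                }
            }
            // Submit whatever is left over from the last partial chunk
            if (pending > 0) {
                ps.executeBatch();
                flushes++;
            }
        }
        return flushes;
    }
}
```

A call such as `insertInChunks(connection, 1000, 100)` would submit ten batches of 100 rows. A suitable chunk size depends on the driver, the row size and the network, so it is worth measuring a few values.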

answered Oct 13 '22 04:10 by prayagupa