
Performance of MySQL Insert statements in Java: Batch mode prepared statements vs single insert with multiple values

Tags: java, mysql, jdbc

I am designing a MySQL database which needs to handle about 600 row inserts per second across various InnoDB tables. My current implementation uses non-batched prepared statements. However, writes to the MySQL database have become a bottleneck, and my queue size increases over time.

The implementation is written in Java; I don't know the version offhand. It uses MySQL's Java connector. I need to look into switching to JDBC tomorrow. I am assuming these are two different connector packages.

I have read the following threads on the issue:

  • Optimizing MySQL inserts to handle a data stream
  • MyISAM versus InnoDB
  • Inserting Binary data into MySQL (without PreparedStatement's)

and from the mysql site:

  • http://dev.mysql.com/doc/refman/5.0/en/insert-speed.html

My questions are:

  • Does anyone have advice or experience on the performance differences between INSERTs with prepared statements in batch mode and a single INSERT statement with multiple VALUES?

  • What are the performance differences between the MySQL Java connector and JDBC? Should I be using one or the other?

  • The tables are for archive purposes, and will see ~90% write to ~10% read (maybe even less). I am using InnoDB. Is this the right choice over MyISAM?

Thank you in advance for your help.

Darren asked Jul 09 '12 05:07



2 Answers

JDBC is simply the Java SE standard for database access: it defines the standard interfaces, so your code is not bound to a specific JDBC implementation. The MySQL Java connector (Connector/J) is an implementation of the JDBC interfaces for MySQL databases only.

From experience: I'm involved in a project that handles a huge amount of data in MySQL, and we mostly prefer MyISAM for data that can be generated. It gives much higher performance at the cost of losing transactions. Generally speaking, MyISAM is faster, but InnoDB is more reliable.

I also wondered about the performance of INSERT statements about a year ago, and found the following old test code on my code shelf (sorry, it's a bit complex and slightly outside the scope of your question). The code below contains examples of 4 ways of inserting the test data:

  • single INSERTs;
  • batched INSERTs;
  • manual bulk INSERT (never use it - it's dangerous);
  • and finally, prepared bulk INSERT.

It uses TestNG as the runner, and relies on some legacy custom code:

  • the runWithConnection() method, which ensures that the connection is closed or returned to the connection pool after the callback has executed (note that the code below closes statements unreliably, without even a try/finally, to keep the code short);
  • IUnsafeIn<T, E extends Throwable>, a custom callback interface for methods that accept a single parameter but may throw an exception of type E, like: void handle(T argument) throws E;.
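
The two helpers described above are not shown in the answer. A minimal sketch of what they might look like follows; the ConnectionSource interface, the ConnectionRunner class, and the two-argument form of runWithConnection are assumptions made for illustration, not the author's actual code (the original takes only the callback and gets its connection from SqlBaseTest).

```java
import java.sql.Connection;
import java.sql.SQLException;

// Callback interface as described in the list above.
interface IUnsafeIn<T, E extends Throwable> {
    void handle(T argument) throws E;
}

// Hypothetical source of connections (a pool, DriverManager, ...).
interface ConnectionSource {
    Connection open() throws SQLException;
}

// Hypothetical holder for the helper; the original lives in SqlBaseTest.
final class ConnectionRunner {

    private ConnectionRunner() {
    }

    // Runs the callback and always closes the connection afterwards.
    // For a pooled connection, close() typically returns it to the pool.
    static void runWithConnection(ConnectionSource source,
                                  IUnsafeIn<Connection, SQLException> callback) {
        try (Connection connection = source.open()) {
            callback.handle(connection);
        } catch (SQLException e) {
            // Wrapped so that callers (e.g. test methods) need no throws clause.
            throw new RuntimeException(e);
        }
    }
}
```

The try-with-resources block (Java 7+) is what guarantees the close even when the callback throws; the same construct could be used for the statements in the test code.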

    package test;

    import test.IUnsafeIn;

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    import static java.lang.String.format;
    import static java.lang.String.valueOf;
    import static java.lang.System.currentTimeMillis;

    import core.SqlBaseTest;
    import org.testng.annotations.AfterSuite;
    import org.testng.annotations.BeforeSuite;
    import org.testng.annotations.BeforeTest;
    import org.testng.annotations.Test;

    public final class InsertVsBatchInsertTest extends SqlBaseTest {

        private static final int ITERATION_COUNT = 3000;

        private static final String CREATE_TABLE_QUERY = "CREATE TABLE IF NOT EXISTS ttt1 (c1 INTEGER, c2 FLOAT, c3 VARCHAR(5)) ENGINE = InnoDB";
        private static final String DROP_TABLE_QUERY = "DROP TABLE ttt1";
        private static final String CLEAR_TABLE_QUERY = "DELETE FROM ttt1";

        private static void withinTimer(String name, Runnable runnable) {
            final long start = currentTimeMillis();
            runnable.run();
            logStdOutF("%20s: %d ms", name, currentTimeMillis() - start);
        }

        @BeforeSuite
        public void createTable() {
            runWithConnection(new IUnsafeIn<Connection, SQLException>() {
                @Override
                public void handle(Connection connection) throws SQLException {
                    final PreparedStatement statement = connection.prepareStatement(CREATE_TABLE_QUERY);
                    statement.execute();
                    statement.close();
                }
            });
        }

        @AfterSuite
        public void dropTable() {
            runWithConnection(new IUnsafeIn<Connection, SQLException>() {
                @Override
                public void handle(Connection connection) throws SQLException {
                    final PreparedStatement statement = connection.prepareStatement(DROP_TABLE_QUERY);
                    statement.execute();
                    statement.close();
                }
            });
        }

        @BeforeTest
        public void clearTestTable() {
            runWithConnection(new IUnsafeIn<Connection, SQLException>() {
                @Override
                public void handle(Connection connection) throws SQLException {
                    final PreparedStatement statement = connection.prepareStatement(CLEAR_TABLE_QUERY);
                    statement.execute();
                    statement.close();
                }
            });
        }

        @Test
        public void run1SingleInserts() {
            withinTimer("Single inserts", new Runnable() {
                @Override
                public void run() {
                    runWithConnection(new IUnsafeIn<Connection, SQLException>() {
                        @Override
                        public void handle(Connection connection) throws SQLException {
                            for ( int i = 0; i < ITERATION_COUNT; i++ ) {
                                final PreparedStatement statement = connection.prepareStatement("INSERT INTO ttt1 (c1, c2, c3) VALUES (?, ?, ?)");
                                statement.setInt(1, i);
                                statement.setFloat(2, i);
                                statement.setString(3, valueOf(i));
                                statement.execute();
                                statement.close();
                            }
                        }
                    });
                }
            });
        }

        @Test
        public void run2BatchInsert() {
            withinTimer("Batch insert", new Runnable() {
                @Override
                public void run() {
                    runWithConnection(new IUnsafeIn<Connection, SQLException>() {
                        @Override
                        public void handle(Connection connection) throws SQLException {
                            final PreparedStatement statement = connection.prepareStatement("INSERT INTO ttt1 (c1, c2, c3) VALUES (?, ?, ?)");
                            for ( int i = 0; i < ITERATION_COUNT; i++ ) {
                                statement.setInt(1, i);
                                statement.setFloat(2, i);
                                statement.setString(3, valueOf(i));
                                statement.addBatch();
                            }
                            statement.executeBatch();
                            statement.close();
                        }
                    });
                }
            });
        }

        @Test
        public void run3DirtyBulkInsert() {
            withinTimer("Dirty bulk insert", new Runnable() {
                @Override
                public void run() {
                    runWithConnection(new IUnsafeIn<Connection, SQLException>() {
                        @Override
                        public void handle(Connection connection) throws SQLException {
                            final StringBuilder builder = new StringBuilder("INSERT INTO ttt1 (c1, c2, c3) VALUES ");
                            for ( int i = 0; i < ITERATION_COUNT; i++ ) {
                                if ( i != 0 ) {
                                    builder.append(",");
                                }
                                builder.append(format("(%s, %s, '%s')", i, i, i));
                            }
                            final String query = builder.toString();
                            final PreparedStatement statement = connection.prepareStatement(query);
                            statement.execute();
                            statement.close();
                        }
                    });
                }
            });
        }

        @Test
        public void run4SafeBulkInsert() {
            withinTimer("Safe bulk insert", new Runnable() {
                @Override
                public void run() {
                    runWithConnection(new IUnsafeIn<Connection, SQLException>() {
                        private String getInsertPlaceholders(int placeholderCount) {
                            final StringBuilder builder = new StringBuilder("(");
                            for ( int i = 0; i < placeholderCount; i++ ) {
                                if ( i != 0 ) {
                                    builder.append(",");
                                }
                                builder.append("?");
                            }
                            return builder.append(")").toString();
                        }

                        @SuppressWarnings("AssignmentToForLoopParameter")
                        @Override
                        public void handle(Connection connection) throws SQLException {
                            final int columnCount = 3;
                            final StringBuilder builder = new StringBuilder("INSERT INTO ttt1 (c1, c2, c3) VALUES ");
                            final String placeholders = getInsertPlaceholders(columnCount);
                            for ( int i = 0; i < ITERATION_COUNT; i++ ) {
                                if ( i != 0 ) {
                                    builder.append(",");
                                }
                                builder.append(placeholders);
                            }
                            final int maxParameterIndex = ITERATION_COUNT * columnCount;
                            final String query = builder.toString();
                            final PreparedStatement statement = connection.prepareStatement(query);
                            int valueIndex = 0;
                            for ( int parameterIndex = 1; parameterIndex <= maxParameterIndex; valueIndex++ ) {
                                statement.setObject(parameterIndex++, valueIndex);
                                statement.setObject(parameterIndex++, valueIndex);
                                statement.setObject(parameterIndex++, valueIndex);
                            }
                            statement.execute();
                            statement.close();
                        }
                    });
                }
            });
        }

    }

Take a look at the methods annotated with @Test: they are the ones that actually execute the INSERT statements. Also take a look at the CREATE_TABLE_QUERY constant: in the source code it uses InnoDB, which produces the following results on my machine with MySQL 5.5 installed (MySQL Connector/J 5.1.12):

InnoDB
  Single inserts: 74148 ms
  Batch insert: 84370 ms
  Dirty bulk insert: 178 ms
  Safe bulk insert: 118 ms

If you change InnoDB to MyISAM in CREATE_TABLE_QUERY, you'll see a significant performance increase:

MyISAM
  Single inserts: 604 ms
  Batch insert: 447 ms
  Dirty bulk insert: 63 ms
  Safe bulk insert: 26 ms

Hope this helps.

UPD:

For the 4th way, you must set max_allowed_packet in the MySQL configuration file (my.cnf, or my.ini on Windows; the [mysqld] section) to a value large enough to support really big packets.
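
If raising max_allowed_packet is not an option, one alternative is to split the rows across several smaller multi-row statements so that each packet stays modest in size. The following is only a sketch under that assumption: the BulkInsertSql class and its method names are hypothetical, and a suitable chunk size would have to be found by measurement.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original test code.
final class BulkInsertSql {

    private BulkInsertSql() {
    }

    // Builds e.g. "INSERT INTO t (a, b) VALUES (?,?),(?,?)" for rowCount rows.
    // Table and column names are concatenated as-is, so they must come from
    // trusted code, never from user input.
    static String build(String table, String[] columns, int rowCount) {
        if (columns.length < 1 || rowCount < 1) {
            throw new IllegalArgumentException("need at least one column and one row");
        }
        final StringBuilder row = new StringBuilder("(");
        for (int i = 0; i < columns.length; i++) {
            if (i != 0) {
                row.append(",");
            }
            row.append("?");
        }
        row.append(")");
        final StringBuilder sql = new StringBuilder("INSERT INTO ")
                .append(table)
                .append(" (")
                .append(String.join(", ", columns))
                .append(") VALUES ");
        for (int r = 0; r < rowCount; r++) {
            if (r != 0) {
                sql.append(",");
            }
            sql.append(row);
        }
        return sql.toString();
    }

    // Splits totalRows into statement sizes of at most maxRowsPerStatement.
    static List<Integer> chunkSizes(int totalRows, int maxRowsPerStatement) {
        if (totalRows < 0 || maxRowsPerStatement < 1) {
            throw new IllegalArgumentException("invalid sizes");
        }
        final List<Integer> sizes = new ArrayList<>();
        for (int remaining = totalRows; remaining > 0; remaining -= maxRowsPerStatement) {
            sizes.add(Math.min(remaining, maxRowsPerStatement));
        }
        return sizes;
    }
}
```

For each size returned by chunkSizes, you would prepare the statement produced by build, bind that chunk's values, and execute it. Full-size chunks all share the same SQL text, so their PreparedStatement can be reused.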

Lyubomyr Shaydariv answered Sep 21 '22 03:09


I know this thread is pretty old, but I just thought I would mention that if you add "rewriteBatchedStatements=true" to the JDBC URL when using MySQL, it can result in huge performance gains for batched statements.
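
A minimal sketch of how that property might be wired in, assuming Connector/J is on the classpath; the RewriteBatchSketch class, the host, port and database values, and the reuse of the ttt1 table from the other answer are all illustrative assumptions.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper class for illustration only.
final class RewriteBatchSketch {

    private RewriteBatchSketch() {
    }

    // Builds a Connector/J URL with the rewriteBatchedStatements property set.
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?rewriteBatchedStatements=true";
    }

    // Ordinary JDBC batching; with the property above, the driver may send
    // the batch as multi-row INSERTs instead of one statement per row.
    static void insertBatch(Connection connection, int rowCount) throws SQLException {
        final String sql = "INSERT INTO ttt1 (c1, c2, c3) VALUES (?, ?, ?)";
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            for (int i = 0; i < rowCount; i++) {
                statement.setInt(1, i);
                statement.setFloat(2, i);
                statement.setString(3, String.valueOf(i));
                statement.addBatch();
            }
            statement.executeBatch();
        }
    }
}
```

A connection would then be opened with something like DriverManager.getConnection(RewriteBatchSketch.jdbcUrl("localhost", 3306, "test"), user, password) and passed to insertBatch. The application code stays plain JDBC batching, and only the URL changes.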

Jordan L answered Sep 18 '22 03:09