I am designing a MySQL database which needs to handle about 600 row inserts per second across various InnoDB tables. My current implementation uses non-batched prepared statements. However, writing to the MySQL database is the bottleneck and my queue size grows over time.
The implementation is written in Java; I don't know the version offhand. It uses MySQL's Java connector. I need to look into switching to JDBC tomorrow. I am assuming these are two different connector packages.
I have read the following threads on the issue:
and from the MySQL site:
My questions are:
Does anyone have advice or experience on the performance differences between INSERTs with prepared statements in batch mode vs. a single INSERT statement with multiple VALUES lists?
What are the performance differences between the MySQL Java connector and JDBC? Should I be using one or the other?
The tables are for archive purposes and will see ~90% writes to ~10% reads (maybe even less). I am using InnoDB. Is this the right choice over MyISAM?
Thank you in advance for your help.
To optimize insert speed, combine many small operations into a single large operation. Ideally, you make a single connection, send the data for many new rows at once, and delay all index updates and consistency checking until the very end.
LOAD DATA (all forms) is more efficient than INSERT because it loads rows in bulk.
Using a bulk insert statement in MySQL: the INSERT statement also supports the VALUES syntax for inserting multiple rows at once. To do this, include multiple lists of column values, each enclosed within parentheses and separated by commas.
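For example, a single multi-row INSERT can be sent from Java like this; the table, columns, connection URL and credentials below are made up purely for illustration:

// Three rows in one statement and one round trip: several parenthesised value
// lists after a single VALUES keyword. All names here are placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public final class MultiRowInsertExample {
    public static void main(String[] args) throws SQLException {
        try (Connection connection = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/archive", "user", "password");
             Statement statement = connection.createStatement()) {
            statement.executeUpdate(
                "INSERT INTO samples (id, value, label) VALUES "
                + "(1, 1.0, 'a'), (2, 2.0, 'b'), (3, 3.0, 'c')");
        }
    }
}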
JDBC is simply a Java SE standard for database access that defines the standard interfaces, so you're not really bound to a specific JDBC implementation. The MySQL Java connector (Connector/J) is an implementation of the JDBC interfaces for MySQL databases only. From experience: I'm involved in a project that pushes a huge amount of data through MySQL, and we mostly prefer MyISAM for data that can be regenerated: it allows much higher performance at the cost of losing transactions. Generally speaking, MyISAM is faster, but InnoDB is more reliable.
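In other words, only the driver jar on the classpath and the jdbc:mysql URL are MySQL-specific; everything else is written against the java.sql interfaces. A minimal sketch (host, schema and credentials are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public final class DriverCheck {
    public static void main(String[] args) throws SQLException {
        // The URL decides which JDBC implementation (here Connector/J) is used.
        try (Connection connection = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/test", "user", "password")) {
            // Prints the name of the concrete JDBC driver implementation in use.
            System.out.println(connection.getMetaData().getDriverName());
        }
    }
}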
I wondered about the performance of INSERT statements too about a year ago, and found the following old testing code on my code shelf (sorry, it's a bit complex and a bit outside the scope of your question). The code below contains examples of 4 ways of inserting the test data:
single INSERTs;
batch INSERTs;
a dirty bulk INSERT (never use it - it's dangerous);
a safe bulk INSERT (a single multi-row INSERT built with ? placeholders).
It uses TestNG as the runner, and relies on some custom legacy code such as:
the runWithConnection() method - it ensures that the connection is closed or put back to the connection pool after the callback is executed (although the code below uses an unreliable statement-closing strategy - without even try/finally - to keep the code shorter);
IUnsafeIn<T, E extends Throwable> - a custom callback interface for methods accepting a single parameter but potentially throwing an exception of type E, like: void handle(T argument) throws E.
A minimal sketch of these two helpers is shown right after this list.
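Neither helper is included in the original code, so this is only a hypothetical reconstruction, assuming a plain DriverManager connection rather than a real connection pool; the URL and credentials are placeholders:

// test/IUnsafeIn.java - hypothetical reconstruction
package test;

public interface IUnsafeIn<T, E extends Throwable> {
    void handle(T argument) throws E;
}

// core/SqlBaseTest.java - hypothetical reconstruction
package core;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

import test.IUnsafeIn;

public abstract class SqlBaseTest {
    // Formats a message and prints it to stdout.
    protected static void logStdOutF(String format, Object... args) {
        System.out.println(String.format(format, args));
    }

    // Opens a connection, hands it to the callback and always closes it afterwards.
    protected void runWithConnection(IUnsafeIn<Connection, SQLException> callback) {
        // Placeholder URL and credentials.
        try (Connection connection = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/test", "user", "password")) {
            callback.handle(connection);
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }
}

And here is the test code itself: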
package test;

import test.IUnsafeIn;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

import static java.lang.String.format;
import static java.lang.String.valueOf;
import static java.lang.System.currentTimeMillis;

import core.SqlBaseTest;
import org.testng.annotations.AfterSuite;
import org.testng.annotations.BeforeSuite;
import org.testng.annotations.BeforeTest;
import org.testng.annotations.Test;

public final class InsertVsBatchInsertTest extends SqlBaseTest {

    private static final int ITERATION_COUNT = 3000;

    private static final String CREATE_TABLE_QUERY =
            "CREATE TABLE IF NOT EXISTS ttt1 (c1 INTEGER, c2 FLOAT, c3 VARCHAR(5)) ENGINE = InnoDB";
    private static final String DROP_TABLE_QUERY = "DROP TABLE ttt1";
    private static final String CLEAR_TABLE_QUERY = "DELETE FROM ttt1";

    private static void withinTimer(String name, Runnable runnable) {
        final long start = currentTimeMillis();
        runnable.run();
        logStdOutF("%20s: %d ms", name, currentTimeMillis() - start);
    }

    @BeforeSuite
    public void createTable() {
        runWithConnection(new IUnsafeIn<Connection, SQLException>() {
            @Override
            public void handle(Connection connection) throws SQLException {
                final PreparedStatement statement = connection.prepareStatement(CREATE_TABLE_QUERY);
                statement.execute();
                statement.close();
            }
        });
    }

    @AfterSuite
    public void dropTable() {
        runWithConnection(new IUnsafeIn<Connection, SQLException>() {
            @Override
            public void handle(Connection connection) throws SQLException {
                final PreparedStatement statement = connection.prepareStatement(DROP_TABLE_QUERY);
                statement.execute();
                statement.close();
            }
        });
    }

    @BeforeTest
    public void clearTestTable() {
        runWithConnection(new IUnsafeIn<Connection, SQLException>() {
            @Override
            public void handle(Connection connection) throws SQLException {
                final PreparedStatement statement = connection.prepareStatement(CLEAR_TABLE_QUERY);
                statement.execute();
                statement.close();
            }
        });
    }

    @Test
    public void run1SingleInserts() {
        withinTimer("Single inserts", new Runnable() {
            @Override
            public void run() {
                runWithConnection(new IUnsafeIn<Connection, SQLException>() {
                    @Override
                    public void handle(Connection connection) throws SQLException {
                        for ( int i = 0; i < ITERATION_COUNT; i++ ) {
                            final PreparedStatement statement = connection.prepareStatement(
                                    "INSERT INTO ttt1 (c1, c2, c3) VALUES (?, ?, ?)");
                            statement.setInt(1, i);
                            statement.setFloat(2, i);
                            statement.setString(3, valueOf(i));
                            statement.execute();
                            statement.close();
                        }
                    }
                });
            }
        });
    }

    @Test
    public void run2BatchInsert() {
        withinTimer("Batch insert", new Runnable() {
            @Override
            public void run() {
                runWithConnection(new IUnsafeIn<Connection, SQLException>() {
                    @Override
                    public void handle(Connection connection) throws SQLException {
                        final PreparedStatement statement = connection.prepareStatement(
                                "INSERT INTO ttt1 (c1, c2, c3) VALUES (?, ?, ?)");
                        for ( int i = 0; i < ITERATION_COUNT; i++ ) {
                            statement.setInt(1, i);
                            statement.setFloat(2, i);
                            statement.setString(3, valueOf(i));
                            statement.addBatch();
                        }
                        statement.executeBatch();
                        statement.close();
                    }
                });
            }
        });
    }

    @Test
    public void run3DirtyBulkInsert() {
        withinTimer("Dirty bulk insert", new Runnable() {
            @Override
            public void run() {
                runWithConnection(new IUnsafeIn<Connection, SQLException>() {
                    @Override
                    public void handle(Connection connection) throws SQLException {
                        final StringBuilder builder = new StringBuilder("INSERT INTO ttt1 (c1, c2, c3) VALUES ");
                        for ( int i = 0; i < ITERATION_COUNT; i++ ) {
                            if ( i != 0 ) {
                                builder.append(",");
                            }
                            builder.append(format("(%s, %s, '%s')", i, i, i));
                        }
                        final String query = builder.toString();
                        final PreparedStatement statement = connection.prepareStatement(query);
                        statement.execute();
                        statement.close();
                    }
                });
            }
        });
    }

    @Test
    public void run4SafeBulkInsert() {
        withinTimer("Safe bulk insert", new Runnable() {
            @Override
            public void run() {
                runWithConnection(new IUnsafeIn<Connection, SQLException>() {
                    private String getInsertPlaceholders(int placeholderCount) {
                        final StringBuilder builder = new StringBuilder("(");
                        for ( int i = 0; i < placeholderCount; i++ ) {
                            if ( i != 0 ) {
                                builder.append(",");
                            }
                            builder.append("?");
                        }
                        return builder.append(")").toString();
                    }

                    @SuppressWarnings("AssignmentToForLoopParameter")
                    @Override
                    public void handle(Connection connection) throws SQLException {
                        final int columnCount = 3;
                        final StringBuilder builder = new StringBuilder("INSERT INTO ttt1 (c1, c2, c3) VALUES ");
                        final String placeholders = getInsertPlaceholders(columnCount);
                        for ( int i = 0; i < ITERATION_COUNT; i++ ) {
                            if ( i != 0 ) {
                                builder.append(",");
                            }
                            builder.append(placeholders);
                        }
                        final int maxParameterIndex = ITERATION_COUNT * columnCount;
                        final String query = builder.toString();
                        final PreparedStatement statement = connection.prepareStatement(query);
                        int valueIndex = 0;
                        for ( int parameterIndex = 1; parameterIndex <= maxParameterIndex; valueIndex++ ) {
                            statement.setObject(parameterIndex++, valueIndex);
                            statement.setObject(parameterIndex++, valueIndex);
                            statement.setObject(parameterIndex++, valueIndex);
                        }
                        statement.execute();
                        statement.close();
                    }
                });
            }
        });
    }
}
Take a look at the methods annotated with @Test: they are the ones that actually execute the INSERT statements. Also please take a look at the CREATE_TABLE_QUERY constant: in the source code it uses InnoDB, producing the following results on my machine with MySQL 5.5 installed (MySQL Connector/J 5.1.12):
InnoDB
Single inserts: 74148 ms
Batch insert: 84370 ms
Dirty bulk insert: 178 ms
Safe bulk insert: 118 ms
If you change the CREATE_TABLE_QUERY constant from InnoDB to MyISAM, you'll see a significant performance increase:
MyISAM
Single inserts: 604 ms
Batch insert: 447 ms
Dirty bulk insert: 63 ms
Safe bulk insert: 26 ms
Hope this helps.
UPD: For the 4th way (the safe bulk insert) you must set max_allowed_packet in my.cnf / my.ini (in the [mysqld] section) to a value large enough to support really big packets.
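For example (the 64M value below is just an assumption; size it to the largest statement you generate):

[mysqld]
max_allowed_packet=64M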
I know this thread is pretty old, but I just thought I would mention that adding rewriteBatchedStatements=true to the JDBC URL when using MySQL can result in huge performance gains with batched statements.
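For example (host, schema and credentials are placeholders; only the URL parameter matters here):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public final class RewriteBatchedStatementsExample {
    public static void main(String[] args) throws SQLException {
        // With rewriteBatchedStatements=true, Connector/J can rewrite a JDBC batch
        // into multi-row INSERT ... VALUES (...), (...) statements on the wire,
        // which behaves much like the "safe bulk insert" above.
        try (Connection connection = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/archive?rewriteBatchedStatements=true",
                "user", "password")) {
            // Use addBatch()/executeBatch() as usual; no other code changes are needed.
        }
    }
}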