
Streaming large result sets with MySQL

I'm developing a Spring application that uses large MySQL tables. When loading large tables, I get an OutOfMemoryException, since the driver tries to load the entire table into application memory.

I tried using

statement.setFetchSize(Integer.MIN_VALUE); 

but then every ResultSet I open hangs on close(). Looking online, I found that this happens because the driver tries to load any unread rows before closing the ResultSet, but that is not the case here, since I do this:

ResultSet existingRecords = getTableData(tablename);
try {
    while (existingRecords.next()) {
        // ...
    }
} finally {
    existingRecords.close(); // this line is hanging, and there was no exception in the try clause
}

The hangs happen for small tables (3 rows) as well, and if I don't close the ResultSet (which happened in one method), then connection.close() hangs.


Stack trace of the hang:

SocketInputStream.socketRead0(FileDescriptor, byte[], int, int, int) line: not available [native method]
SocketInputStream.read(byte[], int, int) line: 129
ReadAheadInputStream.fill(int) line: 113
ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(byte[], int, int) line: 160
ReadAheadInputStream.read(byte[], int, int) line: 188
MysqlIO.readFully(InputStream, byte[], int, int) line: 2428
MysqlIO.reuseAndReadPacket(Buffer, int) line: 2882
MysqlIO.reuseAndReadPacket(Buffer) line: 2871
MysqlIO.checkErrorPacket(int) line: 3414
MysqlIO.checkErrorPacket() line: 910
MysqlIO.nextRow(Field[], int, boolean, int, boolean, boolean, boolean, Buffer) line: 1405
RowDataDynamic.nextRecord() line: 413
RowDataDynamic.next() line: 392
RowDataDynamic.close() line: 170
JDBC4ResultSet(ResultSetImpl).realClose(boolean) line: 7473
JDBC4ResultSet(ResultSetImpl).close() line: 881
DelegatingResultSet.close() line: 152
DelegatingResultSet.close() line: 152
DelegatingPreparedStatement(DelegatingStatement).close() line: 163
(This is my class) Database.close() line: 84

configurator asked Mar 15 '10 13:03



2 Answers

Only setting the fetch size is not the correct approach. The javadoc of Statement#setFetchSize() already states the following:

Gives the JDBC driver a hint as to the number of rows that should be fetched from the database

The driver is actually free to apply or ignore the hint. Some drivers ignore it, some drivers apply it directly, some drivers need more parameters. The MySQL JDBC driver falls in the last category. If you check the MySQL JDBC driver documentation, you'll see the following information (scroll about 2/3 down until header ResultSet):

To enable this functionality, you need to create a Statement instance in the following manner:

stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY, java.sql.ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);

Please read the entire section of the documentation, as it also describes the caveats of this approach. Here's a relevant quote:

There are some caveats with this approach. You will have to read all of the rows in the result set (or close it) before you can issue any other queries on the connection, or an exception will be thrown.

(...)

If the statement is within scope of a transaction, then locks are released when the transaction completes (which implies that the statement needs to complete first). As with most other databases, statements are not complete until all the results pending on the statement are read or the active result set for the statement is closed.

If that doesn't fix the OutOfMemoryError (not Exception), then the problem is likely that you're storing all the data in Java's memory instead of processing each row as soon as it comes in. Fixing this would require more changes in your code, maybe a complete rewrite. I've answered a similar question before here.
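The statement setup quoted above, combined with handling each row as it arrives, might look roughly like the sketch below. The class `StreamingQuery`, the `RowHandler` interface and the `stream` method are hypothetical names introduced for illustration; only the `createStatement` arguments and the `setFetchSize(Integer.MIN_VALUE)` call come from the driver documentation quoted above.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, not part of JDBC or the MySQL driver.
final class StreamingQuery {

    /** Hypothetical callback: receives the ResultSet positioned on the current row. */
    @FunctionalInterface
    interface RowHandler {
        void handle(ResultSet row) throws SQLException;
    }

    /** Runs the query, hands every row to the handler, and returns the row count. */
    static long stream(Connection conn, String sql, RowHandler handler) throws SQLException {
        // Forward-only, read-only statement, as the quoted documentation requires.
        // Closing the Statement also closes its current ResultSet (JDBC contract),
        // so only the Statement is managed by try-with-resources here.
        try (Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            stmt.setFetchSize(Integer.MIN_VALUE);
            ResultSet rs = stmt.executeQuery(sql);
            long rows = 0;
            while (rs.next()) {
                handler.handle(rs); // process the row now; do not collect rows in a list
                rows++;
            }
            return rows;
        }
    }
}
```

A call could then look like `StreamingQuery.stream(conn, "SELECT id FROM big_table", row -> process(row.getLong("id")))`, where `big_table` and `process` are placeholders. Because the handler sees one row at a time, memory use stays flat as long as the handler itself does not accumulate rows.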

BalusC answered Oct 14 '22 14:10


Don't close your ResultSets twice.

Apparently, closing a Statement also attempts to close the corresponding ResultSet, as you can see in these two lines from the stack trace:

DelegatingResultSet.close() line: 152
DelegatingPreparedStatement(DelegatingStatement).close() line: 163

I had thought the hang was in ResultSet.close() but it was actually in Statement.close() which calls ResultSet.close(). Since the ResultSet was already closed, it just hung.

We've replaced all ResultSet.close() with results.getStatement().close() and removed all Statement.close()s, and the problem is now solved.
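As a rough illustration of that replacement, the cleanup can be funneled through one small helper so that each ResultSet is closed exactly once, via its owning Statement. The class and method names below are hypothetical. The `null` check reflects the `ResultSet.getStatement()` Javadoc, which allows a `null` return for result sets that were not produced by a Statement (for example by `DatabaseMetaData`).

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper illustrating "close the Statement, not the ResultSet".
final class ResultSets {

    /** Closes the Statement that produced the ResultSet, which closes the ResultSet too. */
    static void closeViaStatement(ResultSet results) throws SQLException {
        if (results == null) {
            return; // nothing was opened
        }
        Statement owner = results.getStatement();
        if (owner != null) {
            owner.close();   // also closes the Statement's current ResultSet
        } else {
            results.close(); // no owning Statement, so close the ResultSet directly
        }
    }
}
```

In the question's code, the `finally` block would then call `ResultSets.closeViaStatement(existingRecords)` instead of `existingRecords.close()`, with no separate `Statement.close()` elsewhere.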

configurator answered Oct 14 '22 16:10