I am using MySQL and Java to SELECT about 50,000 records. The strange thing is that when I read the data with a ResultSet and its next() method, the RAM usage of my Java application increases during the fetch: it starts at 255 MB and climbs to 379 MB! Here is the code I am using:
try {
    Class.forName("com.mysql.jdbc.Driver");
    Connection conn = DriverManager.getConnection(
            "jdbc:mysql://localhost/#mysql50#crawler - used in report?"
            + "user=root&password=&useUnicode=true&characterEncoding=UTF-8");
    Statement st = conn.createStatement();
    ResultSet rsDBReader = st.executeQuery("SELECT Id, Content FROM DocsArchive");
    while (rsDBReader.next()) {
        int docId = rsDBReader.getInt(1);
        String content = rsDBReader.getString(2);
        // ...
    }
    rsDBReader.close();
    st.close();
    conn.close();
} catch (Exception e) {
    System.out.println("Exception in reading data: " + e);
}
I am sure the memory usage comes from the ResultSet, not from other parts of the program. In this program I don't need to update records, so I would like to discard each record once I have finished processing it. My guess is that rows that have already been read are not removed, so the program never frees their memory. I have tried a few tricks to avoid this, such as the following code:
Statement st = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
        ResultSet.CONCUR_READ_ONLY, ResultSet.CLOSE_CURSORS_AT_COMMIT);
st.setFetchSize(500);
rsDBReader.setFetchSize(500);
but they didn't change anything. :(
So I need some way to release the memory of rows that have already been read.
Another interesting point is that even after the function finishes and the ResultSet, Statement, and Connection are closed, and the program moves on to other parts, its memory usage still doesn't decrease! Thanks.
Memory usage will increase if multiple buffers of the same type, such as join_buffer_size or sort_buffer_size, are allocated during a JOIN or sort operation. For example, MySQL allocates one join buffer to perform a join between two tables.
The default configuration is designed to permit a MySQL server to start on a virtual machine that has approximately 512MB of RAM. You can improve MySQL performance by increasing the values of certain cache and buffer-related system variables.
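For illustration only, such a buffer can also be raised at the session level from JDBC before running a heavy query; the variable and the 4 MB value here are just examples, not recommendations:

// Example only: raise this session's sort buffer to 4 MB before a heavy query.
// The variable chosen and the size are assumptions, not tuning advice.
try (Statement s = conn.createStatement()) {
    s.execute("SET SESSION sort_buffer_size = 4 * 1024 * 1024");
}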
Use Statement.setFetchSize() to provide a hint to the driver that it should stream the ResultSet rather than buffer it in full. As far as I know, the MySQL Connector/J driver does understand the hint and streams ResultSets (though in MySQL's case this is restricted to one row at a time).
The default value of 0 ensures that the Connector/J driver fetches the complete ResultSet without streaming it. That's why you need to provide an explicit value: Integer.MIN_VALUE in the case of MySQL.
The statement:

Statement st = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
        ResultSet.CONCUR_READ_ONLY, ResultSet.CLOSE_CURSORS_AT_COMMIT);

does not result in streaming the ResultSet (at least not of its own accord). It merely ensures that the result set is not "scrollable" (i.e. it can be traversed only in the forward direction), is not "updatable", and that the underlying cursor will be closed when the transaction commits.
As noted in the JDBC implementation notes for MySQL, the above statement (without the ResultSet.CLOSE_CURSORS_AT_COMMIT parameter) has to be invoked in conjunction with Statement.setFetchSize(Integer.MIN_VALUE) for row-by-row streaming to occur. The caveats involved in such a scenario are documented there as well.
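Putting those two together, a minimal sketch of the streaming setup from the MySQL implementation notes, reusing the query from the question, would be:

// Streaming setup per the MySQL Connector/J implementation notes.
Statement st = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
        ResultSet.CONCUR_READ_ONLY);
st.setFetchSize(Integer.MIN_VALUE); // MySQL-specific signal: stream row by row
ResultSet rs = st.executeQuery("SELECT Id, Content FROM DocsArchive");
while (rs.next()) {
    // Each row is fetched from the server as it is consumed,
    // so the driver never buffers the full result set in memory.
}
rs.close();
st.close();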
Note that the holdability of the cursor is not specified in the example in the MySQL documentation. If you need a value different from the one provided by Connection.getHoldability(), then again, this advice might not apply.
I'd suggest limiting the number of rows you retrieve per query. 50,000 is a lot, so why not use a loop that fetches, say, 1,000 rows at a time?
You can achieve this with the LIMIT clause, as described here. It's always best to be pragmatic about the amount of data you're processing. Your current SELECT might return 50,000 rows today, but what if it grows to one million tomorrow? Your application will choke. So do your processing step by step.
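As a sketch, such a loop could use keyset pagination; the assumption here, not stated in the question, is that Id is an ascending integer primary key:

// Chunked retrieval via keyset pagination. Assumes Id is an ascending
// integer primary key (an assumption, not confirmed by the question).
final int CHUNK_SIZE = 1000;
PreparedStatement ps = conn.prepareStatement(
        "SELECT Id, Content FROM DocsArchive WHERE Id > ? ORDER BY Id LIMIT " + CHUNK_SIZE);
int lastId = 0;
boolean more = true;
while (more) {
    more = false;
    ps.setInt(1, lastId);
    ResultSet rs = ps.executeQuery();
    while (rs.next()) {
        lastId = rs.getInt(1);
        String content = rs.getString(2);
        // ... process the row ...
        more = true; // keep looping while chunks come back non-empty
    }
    rs.close();
}
ps.close();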
Note that there is a similar issue with recent releases of Postgres. To get cursor-based processing, you need to disable auto-commit on the connection with connection.setAutoCommit(false) and use a single SQL statement (i.e. one query, not several separated by semicolons). It worked for me.
Postgres JDBC documentation
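For illustration, a minimal sketch of that Postgres setup; the fetch size of 1,000 is an arbitrary choice:

// Cursor-based fetching with the PostgreSQL JDBC driver.
conn.setAutoCommit(false);   // required; otherwise the driver fetches everything
Statement st = conn.createStatement();
st.setFetchSize(1000);       // rows fetched per round trip (arbitrary value)
ResultSet rs = st.executeQuery("SELECT Id, Content FROM DocsArchive");
while (rs.next()) {
    // ... process the row ...
}
rs.close();
st.close();
conn.commit();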