Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Handling very large amount of data in MyBatis

My goal is actually to dump all the data of a database to an XML file. The database is not terribly big, it's about 300MB. The problem is that I have a memory limitation of 256MB (in JVM) only. So obviously I cannot just read everything into memory.

I managed to solve this problem using iBatis (yes I mean iBatis, not myBatis) by calling it's getList(... int skip, int max) multiple times, with incremented skip. That does solve my memory problem, but I'm not impressed with the speed. The variable names suggests that what the method does under the hood is to read the entire result-set skip then specified record. This sounds quite redundant to me (I'm not saying that's what the method is doing, I'm just guessing base on the variable name).

Now, I switched to myBatis 3 for the next version of my application. My question is: is there any better way to handle large amount of data chunk by chunk in myBatis? Is there anyway to make myBatis process first N records, return them to the caller while keeping the result set connection open so the next time the user calls the getList(...) it will start reading from the N+1 record without doing any "skipping"?

like image 421
Alvin Avatar asked Jul 01 '11 09:07

Alvin


2 Answers

myBatis CAN stream results. What you need is a custom result handler. With this you can take each row separately and write it to your XML file. The overall scheme looks like this:

session.select(
    "mappedStatementThatFindsYourObjects",
    parametersForStatement,
    resultHandler);

Where resultHandler is an instance of a class implementing the ResultHandler interface. This interface has just one method handleResult. This method provides you with a ResultContext object. From this context you can retrieve the row currently being read and do something with it.

handleResult(ResultContext context) {
  Object result = context.getResultObject();
  doSomething(result);
}
like image 150
Stefan Oehme Avatar answered Oct 17 '22 17:10

Stefan Oehme


No, mybatis does not have full capability to stream results yet.

EDIT 1: If you don't need nested result mappings then you could implement a custom result handler to stream results. on current released versions of MyBatis. (3.1.1) The current limitation is when you need to do complex result mapping. The NestedResultSetHandler does not allow custom result handlers. A fix is available, and it looks like is currently targeted for 3.2. See Issue 577.

In summary, to stream large result sets using MyBatis you'll need.

  1. Implement your own ResultSetHandler.
  2. Increase fetch size. (as noted below by Guillaume Perrot)
  3. For Nested result maps, use the fix discussed on Issue 577. This fix also resolves some memory issues with large result sets.
like image 38
Andy Avatar answered Oct 17 '22 18:10

Andy