Can someone explain how
hibernate.jdbc.batch_size = 1000 
and
if (i % 100 == 0 && i > 0) {
    session.flush();
    session.clear();
}
work together?
The Hibernate property hibernate.jdbc.batch_size is a way for Hibernate to optimize your insert or update statements, whereas the flushing loop is about avoiding memory exhaustion.
Without a batch size, Hibernate fires one insert statement each time you save an entity, so if you work with a big collection, you get one statement per save.
Consider the following chunk of code:
for (Entity e : entities) {
    session.save(e);
}
Here Hibernate fires one insert statement per entity in your collection: if the collection has 100 elements, 100 insert statements are fired. This approach is not very efficient for 2 main reasons:
1) each statement is sent to the database on its own, which multiplies the round trips;
2) every saved entity stays in the session, so with a large collection you can end up with an OutOfMemoryError.
hibernate.jdbc.batch_size and the flushing loop have 2 different purposes but are complementary.
Hibernate uses the first to control how many statements go into one batch. Under the covers, Hibernate uses the java.sql.Statement.addBatch(...) and executeBatch() methods.
So hibernate.jdbc.batch_size tells Hibernate how many times it has to call addBatch() before calling executeBatch().
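To make the addBatch()/executeBatch() pattern concrete, here is a rough plain-JDBC sketch of the same idea. It is not Hibernate's actual code: the class name JdbcBatchSketch, the helper insertAll, and the assumption that the statement has a single string parameter are all made up for illustration.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class JdbcBatchSketch {

    /**
     * Hypothetical helper: binds each name to parameter 1 of an already
     * prepared INSERT and sends the rows in batches of at most batchSize.
     * Returns how many times executeBatch() was called.
     */
    public static int insertAll(PreparedStatement ps, List<String> names, int batchSize)
            throws SQLException {
        int executed = 0; // number of executeBatch() calls so far
        int pending = 0;  // rows added to the current batch but not yet sent
        for (String name : names) {
            ps.setString(1, name);
            ps.addBatch();             // queue the row on the driver side
            pending++;
            if (pending == batchSize) {
                ps.executeBatch();     // send the full batch to the database
                executed++;
                pending = 0;
            }
        }
        if (pending > 0) {             // send the last, possibly smaller, batch
            ps.executeBatch();
            executed++;
        }
        return executed;
    }
}
```

With 5 rows and a batchSize of 2, this calls addBatch() 5 times but executeBatch() only 3 times, instead of executing 5 separate statements.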
Setting this property therefore does not protect you from memory exhaustion.
To take care of memory you have to flush and clear your session on a regular basis, and this is the purpose of the flushing loop.
When you write:
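int i = 0;
for (Entity e : entities) {
    session.save(e);
    i++;
    // every 100 saved entities, send the pending inserts and empty the session
    if (i % 100 == 0) {
        session.flush();
        session.clear();
    }
}
you're telling Hibernate to flush and clear the session every 100 entities, which releases the memory held by the session.
So what is the link between the two?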
To be optimal, set your jdbc.batch_size and your flush interval to the same value.
If the flush interval is lower than the batch_size, Hibernate flushes the session more often, and each flush sends whatever is pending; the batches therefore stay small and never reach the batch size, which is not efficient.
When the two are the same, Hibernate only executes batches of the optimal size, except for the last one if the size of the collection is not a multiple of your batch_size.
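The interaction between the two settings can be checked with a small model in plain Java. This is only a simplified sketch, not Hibernate's implementation: it assumes that saved entities are queued until a flush, and that a flush sends them in chunks of at most batch_size. The class BatchFlushModel and its method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchFlushModel {

    /**
     * Returns the sizes of the batches that would be sent for entityCount
     * saves, flushing every flushEvery saves, in this simplified model.
     */
    public static List<Integer> batchSizes(int entityCount, int batchSize, int flushEvery) {
        List<Integer> batches = new ArrayList<>();
        int queued = 0; // inserts waiting in the session
        for (int i = 1; i <= entityCount; i++) {
            queued++;                      // models session.save(e)
            if (i % flushEvery == 0) {     // models the flushing loop
                flush(queued, batchSize, batches);
                queued = 0;
            }
        }
        flush(queued, batchSize, batches); // final flush, e.g. at commit
        return batches;
    }

    // Splits the queued inserts into batches of at most batchSize.
    private static void flush(int queued, int batchSize, List<Integer> batches) {
        while (queued > 0) {
            int size = Math.min(queued, batchSize);
            batches.add(size);
            queued -= size;
        }
    }

    public static void main(String[] args) {
        // flush interval equal to batch_size: full batches, then the remainder
        System.out.println(batchSizes(250, 100, 100)); // [100, 100, 50]
        // flush interval lower than batch_size: many small batches
        System.out.println(batchSizes(250, 100, 30));  // [30, 30, 30, 30, 30, 30, 30, 30, 10]
    }
}
```

In this model, 250 entities with batch_size = 100 go out in 3 batches when the flush interval is also 100, but in 9 batches when the flush interval is 30.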
You can see the following post for more details about this last point.