
Safely clearing Hibernate session in the middle of large transaction

I am using Spring+Hibernate for an operation which requires creating and updating literally hundreds of thousands of items. Something like this:

{
   ...
   Foo foo = fooDAO.get(...);
   for (int i=0; i<500000; i++) {
      Bar bar = barDAO.load(i);
      if (bar.needsModification() && foo.foo()) {
         bar.setWhatever("new whatever");
         barDAO.update(bar);
         // commit here
         Baz baz = new Baz();
         bazDAO.create(baz);
         // if (i % 100 == 0), clear
      }
   }
}

To protect myself against losing changes in the middle, I commit the changes immediately after barDAO.update(bar):

HibernateTransactionManager transactionManager = ...; // injected by Spring
DefaultTransactionDefinition def = new DefaultTransactionDefinition();
def.setPropagationBehavior(TransactionDefinition.PROPAGATION_REQUIRED);
TransactionStatus transactionStatus = transactionManager.getTransaction(def);
transactionManager.commit(transactionStatus);

At this point I have to say that the entire process runs in a transaction wrapped in org.springframework.orm.hibernate3.support.ExtendedOpenSessionInViewFilter (yes, this is a webapp).

This all works fine with one exception: after a few thousand updates/commits, the entire process gets really slow, most likely because memory is bloated by the ever-increasing number of objects kept by Spring/Hibernate.

In a Hibernate-only environment this would be easily solved by calling org.hibernate.Session#clear().

Now, the questions:

  • When is it a good time to clear()? Does it have a big performance cost?
  • Why aren't objects like bar or baz released/GCd automatically? What's the point of keeping them in the session after the commit (in the next iteration of the loop they're not reachable anyway)? I haven't taken a memory dump to prove this, but my gut feeling is that they stay there until the whole process exits. If the answer to this is "Hibernate cache", then why isn't the cache flushed when available memory runs low?
  • Is it safe/recommended to call org.hibernate.Session#clear() directly (bearing in mind the entire Spring context, things like lazy loading, etc.)? Are there any usable Spring wrappers/counterparts for achieving the same?
  • If the answer to the above question is yes, what will happen to the object foo, assuming clear() is called inside the loop? What if foo.foo() is a lazy-load method?

Thank you for the answers.

mindas asked Sep 24 '10 14:09




2 Answers

When is it a good time to clear()? Does it have a big performance cost?

At regular intervals, ideally the same as the JDBC batch size, after having flushed the changes. The documentation describes common idioms in the chapter about Batch processing:

13.1. Batch inserts

When making new objects persistent flush() and then clear() the session regularly in order to control the size of the first-level cache.

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

for ( int i=0; i<100000; i++ ) {
    Customer customer = new Customer(.....);
    session.save(customer);
    if ( i % 20 == 0 ) { //20, same as the JDBC batch size
        //flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}

tx.commit();
session.close();

And this shouldn't have a performance cost, au contraire:

  • it keeps the number of objects that must be checked for dirtiness low (so flushing should be fast),
  • it should allow memory to be reclaimed.
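The flush-then-clear cadence can be pulled out of the loop body. The following is a minimal, library-free sketch, assuming Java 8 or later: BatchFlusher and BatchFlusherDemo are hypothetical names (not part of Hibernate or Spring), and the Runnable stands in for the real flush-and-clear calls. It counts writes instead of testing i % 20 == 0, so the action runs after each full batch and once more for a partial final batch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: runs an action once per full batch, plus once for any remainder. */
final class BatchFlusher {
    private final int batchSize;
    private final Runnable flushAndClear; // assumption: wraps session.flush() and session.clear()
    private int pending = 0;

    BatchFlusher(int batchSize, Runnable flushAndClear) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.flushAndClear = flushAndClear;
    }

    /** Call after each entity that was saved or updated. */
    void itemWritten() {
        if (++pending == batchSize) {
            flushAndClear.run();
            pending = 0;
        }
    }

    /** Call once after the loop so a partial final batch is not left unflushed. */
    void finish() {
        if (pending > 0) {
            flushAndClear.run();
            pending = 0;
        }
    }
}

final class BatchFlusherDemo {
    /** Returns the write counts at which the flush-and-clear action ran. */
    static List<Integer> flushPoints(int items, int batchSize) {
        List<Integer> points = new ArrayList<>();
        int[] written = {0};
        BatchFlusher flusher = new BatchFlusher(batchSize, () -> points.add(written[0]));
        for (int i = 0; i < items; i++) {
            written[0]++;          // stands in for a DAO update or create
            flusher.itemWritten();
        }
        flusher.finish();
        return points;
    }

    public static void main(String[] args) {
        System.out.println(flushPoints(50, 20)); // prints [20, 40, 50]
    }
}
```

With a real session the callback would presumably be () -> { session.flush(); session.clear(); }. How the session is obtained under Spring (for example through sessionFactory.getCurrentSession()) depends on the setup and is not shown here.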

Why aren't objects like bar or baz released/GCd automatically? What's the point of keeping them in the session after the commit (in the next iteration of the loop they're not reachable anyway)?

You need to clear() the session explicitly if you don't want to keep entities tracked. That's simply how it works (one might want to commit a transaction without "losing" the entities).

But from what I can see, bar and baz instances should become candidates for GC after the clear. It would be interesting to analyze a memory dump to see exactly what is happening.

is it safe/recommended to call org.hibernate.Session#clear() directly

As long as you flush() the pending changes so that you don't lose them (unless that is what you want), I don't see any problem with that (your current code will lose a create every 100 iterations, but maybe it's just pseudocode).

If the answer to the above question is yes, what will happen to the object foo, assuming clear() is called inside the loop? What if foo.foo() is a lazy-load method?

Calling clear() evicts all loaded instances from the Session, making them detached entities. If a subsequent invocation requires an entity to be "attached", it will fail.
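The retention and detachment behaviour described in this answer can be pictured with a toy model. The sketch below is not Hibernate code: ToySession is a hypothetical stand-in that only mimics the idea of a first-level cache holding strong references, with the assumption that commit leaves the cache alone and only clear() empties it.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a session's first-level cache: id -> strongly referenced entity. */
final class ToySession {
    private final Map<Integer, Object> firstLevelCache = new HashMap<>();

    /** Returns the cached instance for the id, creating ("loading") it on first access. */
    Object load(int id) {
        return firstLevelCache.computeIfAbsent(id, key -> new Object());
    }

    /** True while the session still tracks this exact instance. */
    boolean contains(Object entity) {
        return firstLevelCache.containsValue(entity);
    }

    int size() {
        return firstLevelCache.size();
    }

    /** Assumption of the model: committing does not evict anything. */
    void commit() {
        // intentionally leaves firstLevelCache untouched
    }

    /** Drops every tracked instance; references held elsewhere become "detached". */
    void clear() {
        firstLevelCache.clear();
    }
}

final class ToySessionDemo {
    public static void main(String[] args) {
        ToySession session = new ToySession();
        Object foo = session.load(-1);             // loaded once, before the loop
        for (int i = 0; i < 1000; i++) {
            session.load(i);                       // no local reference survives the iteration
            session.commit();
        }
        System.out.println(session.size());          // 1001: the map still references everything
        session.clear();
        System.out.println(session.size());          // 0: entries can now be garbage collected
        System.out.println(session.contains(foo));   // false: foo is no longer tracked
        System.out.println(session.load(-1) == foo); // false: reloading yields a new instance
    }
}
```

In this model the loop variables go out of scope on every iteration, yet the map keeps each instance reachable until clear() runs. A reference such as foo that was obtained before the clear keeps pointing at the old, untracked instance, which is why it has to be re-obtained from the session before anything that needs an attached entity.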

Pascal Thivent answered Oct 18 '22 01:10


I just wanted to point out that, after clearing the session, if you want to keep using some objects that were in the session, you will have to call Session.refresh(obj) on them first.

Otherwise you will get the following error:

org.hibernate.NonUniqueObjectException
smdb21 answered Oct 18 '22 00:10