
Using StatelessSession for Batch processing

From the documentation:

Suppose we need to insert 100,000 rows/objects:

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

for ( int i=0; i<100000; i++ ) {
    Customer customer = new Customer(.....);
    session.save(customer);
    if ( i % 20 == 0 ) { //20, same as the JDBC batch size
        //flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}

tx.commit();
session.close();

Why should we use that approach? What benefit does it bring compared to the StatelessSession one:

    StatelessSession session = sessionFactory.openStatelessSession();
    Transaction tx = session.beginTransaction();

    for ( int i=0; i<100000; i++ ) {
      Customer customer = new Customer(.....);
      session.insert(customer);
    }    

    tx.commit();
    session.close();

I mean, this last ("alternative") example does not hold the objects in memory, and there is no need to flush or clear the cache. Isn't it supposed to be the best practice for cases like this? Why use the previous one, then?

asked Jan 05 '13 17:01 by ses



1 Answer

From the documentation you link to:

In particular, a stateless session does not implement a first-level cache nor interact with any second-level or query cache. It does not implement transactional write-behind or automatic dirty checking. Operations performed using a stateless session never cascade to associated instances. Collections are ignored by a stateless session. Operations performed via a stateless session bypass Hibernate's event model and interceptors. Due to the lack of a first-level cache, Stateless sessions are vulnerable to data aliasing effects.

Those are some significant limitations!
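
To make the "data aliasing" point concrete, here is a minimal sketch in plain Java. It does not use Hibernate at all: CachingSession and NonCachingSession are hypothetical toy classes that only model the presence or absence of a first-level cache, and Customer is a stand-in entity. With the cache, reading the same id twice gives back the same instance; without it, you get two separate copies of the same row.

```java
import java.util.HashMap;
import java.util.Map;

public class AliasingSketch {
    // Hypothetical entity, standing in for a mapped class such as Customer.
    static final class Customer {
        final long id;
        String name;
        Customer(long id, String name) { this.id = id; this.name = name; }
    }

    // Toy "database": id -> name.
    static final Map<Long, String> rows = new HashMap<>();

    // Toy model of a regular session: remembers every instance it hands out.
    static final class CachingSession {
        private final Map<Long, Customer> firstLevelCache = new HashMap<>();
        Customer get(long id) {
            return firstLevelCache.computeIfAbsent(id, k -> new Customer(k, rows.get(k)));
        }
    }

    // Toy model of a stateless session: builds a fresh object on every read.
    static final class NonCachingSession {
        Customer get(long id) { return new Customer(id, rows.get(id)); }
    }

    // True when two reads of the same id return the very same object.
    static boolean sameInstanceWithCache() {
        rows.put(1L, "Ann");
        CachingSession s = new CachingSession();
        return s.get(1L) == s.get(1L);
    }

    static boolean sameInstanceWithoutCache() {
        rows.put(1L, "Ann");
        NonCachingSession s = new NonCachingSession();
        return s.get(1L) == s.get(1L);
    }

    // True when a change made through one copy is invisible through the other.
    static boolean copiesDivergeWithoutCache() {
        rows.put(1L, "Ann");
        NonCachingSession s = new NonCachingSession();
        Customer a = s.get(1L);
        Customer b = s.get(1L);
        a.name = "Bob";
        return !a.name.equals(b.name);
    }

    public static void main(String[] args) {
        System.out.println("same instance with cache:    " + sameInstanceWithCache());
        System.out.println("same instance without cache: " + sameInstanceWithoutCache());
        System.out.println("copies diverge without cache: " + copiesDivergeWithoutCache());
    }
}
```

The last method is the aliasing effect in miniature: two in-memory objects represent one row, and an edit to one is not reflected in the other.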

If the objects you're creating, or the modifications you're making, are simple changes to scalar fields of individual objects, then I think that a stateless session would have no disadvantages compared to a batched normal session. However, as soon as you want to do something a bit more complex (manipulate a collection-valued property of an object, or another object which is cascaded from the first, say), the stateless session is more of a hindrance than a help.
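
As a rough illustration of the cascade limitation, here is another plain-Java sketch. Again, nothing here is Hibernate API: cascadingSave and statelessInsert are hypothetical helpers that only model the documented difference (a regular session can cascade to associated objects, a stateless one writes just the object you pass), and Customer with a list of Order objects is an assumed mapping.

```java
import java.util.ArrayList;
import java.util.List;

public class CascadeSketch {
    // Hypothetical entities; assume the orders collection is mapped to cascade.
    static final class Order {
        final String item;
        Order(String item) { this.item = item; }
    }

    static final class Customer {
        final String name;
        final List<Order> orders = new ArrayList<>();
        Customer(String name) { this.name = name; }
    }

    // Toy table: each element stands for one inserted row.
    static final class Table {
        final List<String> rowsWritten = new ArrayList<>();
    }

    // Models a regular session's save with cascade: parent plus children.
    static void cascadingSave(Table t, Customer c) {
        t.rowsWritten.add("customer:" + c.name);
        for (Order o : c.orders) {
            t.rowsWritten.add("order:" + o.item);
        }
    }

    // Models a stateless insert: only the object passed in is written.
    static void statelessInsert(Table t, Customer c) {
        t.rowsWritten.add("customer:" + c.name);
    }

    // One customer with two orders.
    static Customer sample() {
        Customer c = new Customer("Ann");
        c.orders.add(new Order("book"));
        c.orders.add(new Order("pen"));
        return c;
    }

    static int rowsAfterCascadingSave() {
        Table t = new Table();
        cascadingSave(t, sample());
        return t.rowsWritten.size();
    }

    static int rowsAfterStatelessInsert() {
        Table t = new Table();
        statelessInsert(t, sample());
        return t.rowsWritten.size();
    }

    public static void main(String[] args) {
        System.out.println("rows after cascading save:  " + rowsAfterCascadingSave());
        System.out.println("rows after stateless insert: " + rowsAfterStatelessInsert());
    }
}
```

In the stateless case the two orders are simply not written, so with a real StatelessSession you would have to insert each associated object yourself, in the right order.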

More generally, if the batched ordinary session gives performance that is good enough, then the stateless session is simply unnecessary complexity. It looks vaguely like the ordinary session, but it has a different API and different semantics, which is the sort of thing that invites bugs.

There can certainly be cases where it is the appropriate tool, but I think these are the exception rather than the rule.

answered Nov 16 '22 03:11 by Tom Anderson