
Hibernate performance issue: persist one by one or in bulk?

I have a ~6GB text file which I need to parse and later persist. By 'parsing' I mean reading a line from the file (usually 2000 chars) and creating a Car object from it, which I later persist.

I'm using a producer-consumer pattern to parse and persist, and I wonder whether it makes any difference (for performance) to persist one object at a time or 1000 (or any other number) in one commit.

At the moment it takes more than 2 hours to persist everything (3 million lines), which seems like too much time to me (though I may be wrong).

Currently I'm doing this:

public void persistCar(Car car) throws Exception
{
    try
    {
        carDAO.beginTransaction();  //get hibernate session...

        //do all save here.

        carDAO.commitTransaction(); // commit the session

    }catch(Exception e)
    {
        carDAO.rollback();
        e.printStackTrace(); 
    }
    finally
    {
        carDAO.close();
    }
}

Before I make any design changes, I was wondering whether the following design is better (or not), and if so, what cars.size() should be. Also, is opening and closing a session considered expensive?

public void persistCars(List<Car> cars) throws Exception
{
    try
    {
        carDAO.beginTransaction();  //get hibernate session...
        for (Car car : cars)
        {
            //do all save here.
        }

        carDAO.commitTransaction(); // commit the session

    }catch(Exception e)
    {
        carDAO.rollback();
        e.printStackTrace(); 
    }
    finally
    {
        carDAO.close();
    }
}
adhg asked Apr 23 '12 02:04




1 Answer

Hibernate has traditionally not handled bulk inserts well, but there are ways to optimize it to some degree.

Take this example from the API docs:

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

for ( int i=0; i<100000; i++ ) {
    Customer customer = new Customer(.....);
    session.save(customer);
    if ( i % 20 == 0 ) { //20, same as the JDBC batch size
        //flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}

tx.commit();
session.close();

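In the above example the session is flushed and cleared after every 20 entities (the same value as the JDBC batch size, hibernate.jdbc.batch_size). This sends the pending inserts to the database as a batch and releases the memory held by the session cache, which makes the operation a little faster.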
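The chunking idea itself does not depend on Hibernate, so here is a minimal sketch of it in plain Java, assuming the consumer side of your producer-consumer setup receives cars one at a time. BatchBuffer is a hypothetical helper, not a Hibernate class; the callback is where something like persistCars(batch) would go, so that each batch gets one transaction instead of each car.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper (not part of Hibernate): buffers items and hands them
// to a callback in fixed-size batches.
final class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> onBatch;
    private final List<T> buffer = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> onBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.onBatch = onBatch;
    }

    // Add one item; emit a batch as soon as the buffer is full.
    void add(T item) {
        buffer.add(item);
        if (buffer.size() == batchSize) {
            flush();
        }
    }

    // Emit whatever is buffered; call once at the end for the last partial batch.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        onBatch.accept(new ArrayList<>(buffer)); // copy so the callback owns its list
        buffer.clear();
    }
}

class BatchBufferDemo {
    public static void main(String[] args) {
        // In real code the callback would persist the batch in one transaction.
        BatchBuffer<String> cars = new BatchBuffer<>(20, batch ->
                System.out.println("persisting " + batch.size() + " cars"));
        for (int i = 0; i < 45; i++) {
            cars.add("car-" + i);
        }
        cars.flush(); // last partial batch of 5
    }
}
```

The batch size of 20 is only taken from the example above; the best value for your data is something to measure rather than assume.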

Here is an interesting article discussing the same topic.

We have successfully implemented an alternative way of doing bulk inserts using stored procedures. In this case you pass the parameters to the SP as a "|"-separated list and write the insert scripts inside the SP. The code might look a bit complex, but it is very effective.
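As a rough sketch of the Java side of that approach, the parameter string could be built like this. PipeList is a hypothetical helper, and rejecting values that contain "|" is my assumption, since how the SP would handle an embedded delimiter depends on how it is written.

```java
import java.util.List;

// Hypothetical helper: builds the "|"-separated parameter string for the SP.
final class PipeList {
    private PipeList() {}

    // Join the values with '|'. Null values and values containing the
    // delimiter are rejected (an assumption: no escaping scheme is defined).
    static String join(List<String> values) {
        for (String v : values) {
            if (v == null || v.indexOf('|') >= 0) {
                throw new IllegalArgumentException("value is null or contains '|': " + v);
            }
        }
        return String.join("|", values);
    }
}

class PipeListDemo {
    public static void main(String[] args) {
        // Prints: Audi|A4|2012
        System.out.println(PipeList.join(List.of("Audi", "A4", "2012")));
    }
}
```

The resulting string would then be bound as a single parameter of a JDBC CallableStatement via setString; the procedure name and how it splits the list are up to your database.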

ManuPK answered Oct 06 '22 00:10