
Massive insert with JPA + Hibernate

I need to do a massive insert using EJB 3, Hibernate, Spring Data and Oracle. Originally I used Spring Data, with the code below:

talaoAITDAO.save(taloes);

Here talaoAITDAO is a Spring Data JpaRepository subclass and taloes is a Collection of TalaoAIT entities. In this entity, the ID is declared like this:

@Id
@Column(name = "ID_TALAO_AIT")
@SequenceGenerator(name = "SQ_TALAO_AIT", sequenceName = "SQ_TALAO_AIT", allocationSize = 1000)
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "SQ_TALAO_AIT")
private Long id;

Also, this entity has no related entities that would trigger cascaded inserts.

My problem here is that all entities are inserted individually (such as INSERT INTO TABLE(col1, col2) VALUES (val1, val2)). Occasionally this causes a timeout, and all the insertions are rolled back. I would like to convert these individual inserts into batch inserts (such as INSERT INTO TABLE(col1, col2) VALUES (val11, val12), (val21, val22), (val31, val32), ...).

While studying alternatives to improve performance, I found this page in the Hibernate documentation, as well as the question "Hibernate batch size confusion" and this other page. Based on them, I wrote this code:

Session session = super.getEntityManager().unwrap(Session.class);
int batchSize = 1000;
for (int i = 0; i < taloes.size(); i++) {
    TalaoAIT talaoAIT = taloes.get(i);
    session.save(talaoAIT);
    // Flush and clear after each full batch, not right after the first element.
    if ((i + 1) % batchSize == 0) {
        session.flush();
        session.clear();
    }
}
// Flush whatever remains from the last, partial batch.
session.flush();
session.clear();

Also, in persistence.xml, I added these properties:

<property name="hibernate.jdbc.batch_size" value="1000" />
<property name="order_inserts" value="true" />

However, although my tests showed a slight improvement (mainly with big collections and big batch sizes), it was not as large as I had hoped. In the console log, I saw that Hibernate continued to issue individual inserts instead of replacing them with a massive insert. Since my entity uses a sequence generator, I believe that is not the problem (according to the Hibernate documentation, I would have a problem if I were using an identity generator).

So my question is: what is missing here? Some configuration? A method I am not using?

Thanks,

Rafael Afonso.

Rafael Afonso asked Nov 29 '13 12:11





1 Answer

A couple of things.

First, your configuration properties are wrong: order_inserts must be hibernate.order_inserts. Currently your setting is ignored, so you haven't changed a thing.
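As a minimal sketch of the corrected names, the same two settings can be written as a plain Java map. The keys are the fully prefixed Hibernate property names, identical to what goes into the name attribute in persistence.xml. The class name, the method name and the batch size of 50 are hypothetical, chosen only for illustration; such a map could, for example, be passed as the second argument to Persistence.createEntityManagerFactory(String, Map).

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BatchingProperties {

    /**
     * Builds the two batching-related settings discussed above.
     * Both keys carry the "hibernate." prefix, as the answer requires.
     */
    public static Map<String, String> batchingProperties(int batchSize) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("hibernate.jdbc.batch_size", String.valueOf(batchSize));
        props.put("hibernate.order_inserts", "true"); // not plain "order_inserts"
        return props;
    }

    public static void main(String[] args) {
        // Print each property as it would appear in configuration.
        batchingProperties(50).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```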

Next, use the EntityManager instead of doing all that nasty Hibernate stuff. The EntityManager also has flush and clear methods, so this should at least clean up your method. Even without insert ordering, flushing and clearing regularly helps a little, because it cleans up the session and prevents dirty checks on all the objects in it.

EntityManager em = getEntityManager();
int batchSize = 1000;
for (int i = 0; i < taloes.size(); i++) {
    TalaoAIT talaoAIT = taloes.get(i);
    em.persist(talaoAIT);
    // Flush and clear after each full batch, not right after the first element.
    if ((i + 1) % batchSize == 0) {
        em.flush();
        em.clear();
    }
}
// Flush whatever remains from the last, partial batch.
em.flush();
em.clear();

Next, you shouldn't make your batches too large, as that can cause memory problems. Start with something like 50 and test what performs best. There is a point at which dirty-checking takes more time than flushing and clearing to the database. You want to find this sweet spot.
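To see how a given batch size splits a collection, the flush-and-clear pattern can be tried without a database. The sketch below is only an illustration: Sink is a hypothetical stand-in for the persist/flush/clear subset of EntityManager, and RecordingSink simply records how many entities each flush would send. Neither is part of JPA or Hibernate.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class BatchFlushSketch {

    /** Hypothetical stand-in for the persist/flush/clear part of EntityManager. */
    public interface Sink<T> {
        void persist(T entity);
        void flush();
        void clear();
    }

    /** Persists every item, flushing and clearing after each full batch. */
    public static <T> void insertInBatches(Collection<T> items, Sink<T> sink, int batchSize) {
        int pending = 0;
        for (T item : items) {
            sink.persist(item);
            pending++;
            if (pending == batchSize) {
                sink.flush();
                sink.clear();
                pending = 0;
            }
        }
        // Flush the last, partial batch only if something is still pending.
        if (pending > 0) {
            sink.flush();
            sink.clear();
        }
    }

    /** Test double that records how many entities each flush covers. */
    public static final class RecordingSink<T> implements Sink<T> {
        public final List<Integer> flushedBatchSizes = new ArrayList<>();
        private int pending = 0;

        @Override public void persist(T entity) { pending++; }
        @Override public void flush() { flushedBatchSizes.add(pending); pending = 0; }
        @Override public void clear() { /* nothing is cached in this sketch */ }
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 120; i++) {
            items.add(i);
        }
        RecordingSink<Integer> sink = new RecordingSink<>();
        insertInBatches(items, sink, 50);
        System.out.println(sink.flushedBatchSizes); // prints [50, 50, 20]
    }
}
```

Counting pending entities instead of testing the loop index avoids a flush after the very first element and skips the final flush when the collection size is an exact multiple of the batch size.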

M. Deinum answered Oct 02 '22 21:10
