Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hibernate / MySQL Bulk insert problem

I'm having trouble getting Hibernate to perform a bulk insert on MySQL.

I'm using Hibernate 3.3 and MySQL 5.1

At a high level, this is what's happening:

@Transactional
public Set<Long> doUpdate(Project project, IRepository externalSource) {
    List<IEntity> entities = externalSource.loadEntites();
    buildEntities(entities, project);
    persistEntities(project);
}
public void persistEntities(Project project) {
     projectDAO.update(project);
}

This results in n log entries (1 for every row) as follows:

Hibernate: insert into ProjectEntity (name, parent_id, path, project_id, state, type) values (?, ?, ?, ?, ?, ?)

I'd like to see this get batched, so the update is more performant. It's possible that this routine could result in tens-of-thousands of rows generated, and a db trip per row is a killer.

Why isn't this getting batched? (It's my understanding that batch inserts are supposed to be default where appropriate by hibernate).

like image 764
Marty Pitt Avatar asked Feb 20 '10 18:02

Marty Pitt


2 Answers

Pascal's answer is correct. However, because you are using MySQL, I also highly recommend that you try using the rewriteBatchedStatements=true parameter in your JDBC URL.

This parameter causes the JDBC driver to dynamically re-write your INSERT batches to use a single "multi-valued" INSERT, e.g.:

INSERT INTO mytable (mycol) VALUES (0);
INSERT INTO mytable (mycol) VALUES (1);
INSERT INTO mytable (mycol) VALUES (2);

will be re-written to:

INSERT INTO mytable (mycol) VALUES (0), VALUES (1), VALUES (2);

This can make a significant difference in some cases. See http://www.jroller.com/mmatthews/entry/speeding_up_batch_inserts_for for some example measurements.

like image 133
Matt Solnit Avatar answered Oct 25 '22 00:10

Matt Solnit


As documented in the Chapter 13. Batch processing:

If you are undertaking batch processing you will need to enable the use of JDBC batching. This is absolutely essential if you want to achieve optimal performance. Set the JDBC batch size to a reasonable number (10-50, for example):

hibernate.jdbc.batch_size 20

Hibernate disables insert batching at the JDBC level transparently if you use an identity identifier generator.

Don't forget to flush and then clear the session regularly or you'll get OutOfMemoryException as documented in 13.1. Batch inserts.

But IMO, for tens-of-thousands of rows, you should consider using the StatelessSession interface.

like image 32
Pascal Thivent Avatar answered Oct 24 '22 23:10

Pascal Thivent