Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hibernate Relationship Mapping/Speed up batch inserts

I have 5 MySQL InnoDB tables: Test,InputInvoice,InputLine,OutputInvoice,OutputLine and each is mapped and functioning in Hibernate. I have played with using StatelessSession/Session, and JDBC batch size. I have removed any generator classes to let MySQL handle the id generation- but it is still performing quite slow. Each of those tables is represented in a java class, and mapped in hibernate accordingly. Currently when it comes time to write the data out, I loop through the objects and do a session.save(Object) or session.insert(Object) if I'm using StatelessSession. I also do a flush and clear (when using Session) when my line count reaches the max jdbc batch size (50).

  1. Would it be faster if I had these in a 'parent' class that held the objects and did a session.save(master) instead of each one?
  2. If I had them in a master/container class, how would I map that in hibernate to reflect the relationship? The container class wouldn't actually be a table of it's own, but a relationship all based on two indexes run_id (int) and line (int).
  3. Another direction would be: How do I get Hibernate to do a multi-row insert?
like image 796
ashurexm Avatar asked Apr 22 '10 20:04

ashurexm


2 Answers

The ID generation strategy is critical for batch insertion in Hibernate. In particular, IDENTITY generation will usually not work (note that AUTO typically maps to IDENTITY as well). This is because during batch insert Hibernate has a flag called "requiresImmediateIdAccess" that says whether or not generated IDs are immediately required or not; if so, batch processing is disabled.

You can easily spot this in the DEBUG-level logs when it says "executing identity-insert immediately" - this means it has skipped batch processing because it was told that generated IDs are required immediately after insertion.

Generation strategies that typically do work are TABLE and SEQUENCE, because Hibernate can pre-generate the IDs, thereby allowing for batch insertion.

A quick way to spot whether your batch insertion works is to activate DEBUG-level logs as BatchingBatcher will explicitly tell you the batch size it's executing ("Executing batch size: " + batchSize ).

Additionally, the following properties are important for achieving batch insertion. I daren't say they're required as I'm not enough of a Hibernate-expert to do so - perhaps it's just my particular configuration - but in my experience they were needed nonetheless:

hibernate.order_inserts = true
hibernate.order_updates = true

These properties are pretty poorly documented, but I believe what they did was enable for the SQL INSERT and UPDATE statements to be properly grouped for batch execution; I think this might be the multi-row inserts you're after. Don't shoot me if I'm wrong on this, I'm recalling from memory.

I'll also go ahead and assume that you set the following property; if not, this should serve as a reminder:

hibernate.jdbc.batch_size = xx

Where xx is your desired batch size, naturally.

like image 165
voetsjoeba Avatar answered Oct 10 '22 10:10

voetsjoeba


The final solution for me was to use voetsjoeba's response as a jumping off point. My hibernate config uses the following options:

hibernate.order_inserts = true
hibernate.order_updates = true
  • I changed from using Session to StatelessSession

  • Re-ordered the Java code to process all the elements in a batch a table at a time. So all of table x, then table y, etc.

  • Removed the <generator> from each class. Java now creates it and assigns it to the object

  • Created logic that allowed me to determine if just an id was being set and not write 'empty' lines to the database

  • Finally, I turned on dynamic-insert for my classes in their hibernate definitions like so: <class name="com.my.class" table="MY_TABLE" dynamic-insert="true">

like image 7
ashurexm Avatar answered Oct 10 '22 10:10

ashurexm