default_batch_fetch_size recommended values

Tags:

hibernate

I was going through some Hibernate tutorials and got stuck on default_batch_fetch_size. The expert comments on "Can Hibernate be used in performance sensitive applications?" clearly explain its significance, but I am trying to understand why the recommended values are 4, 8, 16 or 32, as used in that question.

Regards Tarun


asked Jan 16 '14 12:01

Tarun Bhatt

2 Answers

Summary:

When batch fetching is enabled, Hibernate prepares a lot of queries at startup. These queries take up a lot of memory that cannot be garbage collected. A batch size of 1000 takes about 150 MB of RAM.

So it is best to keep a low general batch size (like 10, 20 or 40) and only set a bigger batch size for specific collections with the @BatchSize annotation.

Detail:

Batch fetching is explained in "Understanding @BatchSize in Hibernate". "hibernate.default_batch_fetch_size" is the general parameter, and the "@BatchSize" annotation overrides the general parameter on a specific association.

But those explanations don't really answer the question: why does the official documentation recommend the values 4, 8 or 16? Obviously, modern databases can handle queries with far more than 16 values in an IN clause, and queries with, say, 1000 values in the IN clause would mean fewer queries and thus better performance. So why not set the batch size to 1000?

I tried it: I set the batch size to 1024, and the answer came up quickly. The Tomcat server took much longer to start, and in the debug log I could see a lot of lines with "Static select for entity ...".

What happened is that Hibernate prepared thousands of static queries. Here is part of the log for one entity:

...
Static select for entity Profile [PESSIMISTIC_READ]: select xxx_ with (holdlock, rowlock ) where id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Static select for entity Profile [PESSIMISTIC_READ]: select xxx_ with (holdlock, rowlock ) where id in (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Static select for entity Profile [PESSIMISTIC_READ]: select xxx_ with (holdlock, rowlock ) where id in (?, ?, ?, ?, ?, ?, ?, ?, ?)
Static select for entity Profile [PESSIMISTIC_READ]: select xxx_ with (holdlock, rowlock ) where id in (?, ?, ?, ?, ?, ?, ?, ?)
Static select for entity Profile [PESSIMISTIC_READ]: select xxx_ with (holdlock, rowlock ) where id in (?, ?, ?, ?, ?, ?, ?)
Static select for entity Profile [PESSIMISTIC_READ]: select xxx_ with (holdlock, rowlock ) where id in (?, ?, ?, ?, ?, ?)
Static select for entity Profile [PESSIMISTIC_READ]: select xxx_ with (holdlock, rowlock ) where id in (?, ?, ?, ?, ?)
Static select for entity Profile [PESSIMISTIC_READ]: select xxx_ with (holdlock, rowlock ) where id in (?, ?, ?, ?)
Static select for entity Profile [PESSIMISTIC_READ]: select xxx_ with (holdlock, rowlock ) where id in (?, ?, ?)
Static select for entity Profile [PESSIMISTIC_READ]: select xxx_ with (holdlock, rowlock ) where id in (?, ?)
Static select for entity Profile [PESSIMISTIC_READ]: select xxx_ with (holdlock, rowlock ) where id = ?
...

As you can see, Hibernate prepares batch fetch statements, but not for every possible number of arguments. It prepares all the statements with 1, 2, 3, ..., 10 arguments, and above that only the statements whose number of arguments equals batchSize/(2^n). For example, batchSize=120 gives 120, 60, 30, 15, 10, 9, 8, ..., 2, 1.
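The rule described above can be written out as a short sketch. It is only a model of the behaviour seen in the logs, not a copy of Hibernate's source, and the names `PreparedBatchSizes` and `sizesFor` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of the rule observed in the logs above.
// Class and method names are hypothetical; this is not Hibernate's own code.
final class PreparedBatchSizes {

    // Returns the argument counts for which a statement would be prepared,
    // largest first: keep halving while the half is still at least 10,
    // then drop to 10, then count down 10, 9, ..., 1.
    static List<Integer> sizesFor(int batchSize) {
        List<Integer> sizes = new ArrayList<>();
        int size = batchSize;
        while (size >= 1) {
            sizes.add(size);
            if (size <= 10) {
                size = size - 1;      // 10, 9, 8, ..., 1
            } else if (size / 2 < 10) {
                size = 10;            // halving would skip below 10, so land on 10
            } else {
                size = size / 2;      // batchSize / 2^n
            }
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(sizesFor(120));
        // prints [120, 60, 30, 15, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
    }
}
```

Under this model, a batch size of 1024 yields 1024, 512, 256, 128, 64, 32, 16 and then 10 down to 1, which agrees with the log excerpt where the 16-argument statement is followed directly by the 10-argument one.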

So I tried batch fetching a collection with various numbers of elements. The results:

For fetching 18 items, Hibernate made 2 queries: one with 16 items and one with 2 items.
For fetching 16 items, Hibernate made 1 query with 16 items.
For fetching 12 items, Hibernate made 2 queries: one with 10 items and one with 2 items.

Hibernate only used the statements prepared at startup.

After that, I monitored the RAM used by all these prepared statements:

batchSize = 0 => 94 MB (my reference)
batchSize = 32 => 156 MB (+62 MB compared with the reference)
batchSize = 64 => 164 MB (+68 MB compared with the reference)
batchSize = 1000 => 250 MB (+156 MB compared with the reference!)

(My project is medium-sized, about 300 entities.)

It's now time for the conclusion:

1) The batch size can have a big effect on startup time and memory consumption. The cost does not scale linearly with the batch size: a batch size of 80 will cost only 2 times more than a batch size of 10.

2) Hibernate can't retrieve a collection of items with batches of arbitrary size; it only uses the prepared batch queries. If you set batchSize=120, the prepared queries will be those with 120, 60, 30, 15, 10, 9, 8, 7, 6, 5, 4, 3, 2 and 1 arguments. So if you try to fetch a collection with 220 items, 4 queries will be fired: the first will retrieve 120 items, the second 60, the third 30 and the fourth 10.
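The way a fetch is split across the prepared statements can be modelled as a greedy choice: always take the largest prepared size that still fits the remaining items. The sketch below assumes that greedy rule (it reproduces the query counts reported above, but it is not Hibernate's actual loader code), and `BatchFetchPlan`, `preparedSizes` and `plan` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of how a fetch is split across prepared statement sizes.
final class BatchFetchPlan {

    // Prepared sizes for a batch size, largest first: halvings above 10, then 10..1.
    static List<Integer> preparedSizes(int batchSize) {
        List<Integer> sizes = new ArrayList<>();
        for (int size = batchSize; size > 10; size = Math.max(size / 2, 10)) {
            sizes.add(size);
        }
        for (int size = Math.min(batchSize, 10); size >= 1; size--) {
            sizes.add(size);
        }
        return sizes;
    }

    // Assumed greedy split: repeatedly use the largest prepared size
    // that does not exceed the number of items still to be fetched.
    static List<Integer> plan(int itemCount, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<Integer> sizes = preparedSizes(batchSize);
        List<Integer> queries = new ArrayList<>();
        int remaining = itemCount;
        while (remaining > 0) {
            for (int size : sizes) {
                if (size <= remaining) {
                    queries.add(size);
                    remaining -= size;
                    break; // size 1 always exists, so progress is guaranteed
                }
            }
        }
        return queries;
    }

    public static void main(String[] args) {
        System.out.println(plan(220, 120)); // prints [120, 60, 30, 10]
        System.out.println(plan(18, 1024)); // prints [16, 2]
        System.out.println(plan(12, 1024)); // prints [10, 2]
    }
}
```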

This explains why the recommended batch sizes are low. I recommend setting a low global batch size like 20 (20 seems better to me than 16, as it will not generate more prepared queries than 16) and setting a specific, bigger @BatchSize only where needed.

(I used Hibernate 5.1)

answered Nov 09 '22 15:11

PCO

With respect to the memory / startup time concerns, try:

 <property name="hibernate.batch_fetch_style" value="dynamic" />

With this setting, only one statement is prepared (with "where id = ?"), and the batch fetch of entities of the same type in the session is built dynamically, up to the limit set by hibernate.default_batch_fetch_size.
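If the settings are supplied from code rather than XML, the same two keys can be collected in a plain `java.util.Properties` object. This is only a sketch: the class name `BatchFetchSettings` is made up, the value 20 is just the low global default suggested in the other answer, and how the properties are handed to Hibernate depends on your bootstrap and is not shown.

```java
import java.util.Properties;

// Hypothetical helper that collects the two settings discussed in this thread.
final class BatchFetchSettings {

    static Properties batchFetchProperties() {
        Properties props = new Properties();
        // Build the IN clause at fetch time instead of preparing many static statements.
        props.setProperty("hibernate.batch_fetch_style", "dynamic");
        // Upper limit on the number of ids per batch fetch (20 is an example value).
        props.setProperty("hibernate.default_batch_fetch_size", "20");
        return props;
    }

    public static void main(String[] args) {
        // Passing these to Hibernate (persistence.xml, Spring, etc.) is left out here.
        batchFetchProperties().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```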

answered Nov 09 '22 14:11

PacoG
