First I'll explain how I understood and use @BatchSize
:
@BatchSize
is made in order to load relations of objects in batch, making less SQL request to the database. This is specially usefull on LAZY @OneToMany
relations.
However it's even useful on LAZY @OneToOne
relation and @ManyToOne
: if you load a list of entities from the database and ask to load a lazyed @*ToOne
entity, it will load the entities by batch even if i just use a test that load the relation of the 1st entity of the list.
Note if some want to tests : This only show if the entities are not already loaded : for instance if you have a list of user with manager and list all users, when you will access to the manager, no request will be triggered since it's already loaded.
The only drawback that i see on that method is if you load a list of item from the database but only use a part of it. This is a post-filtering operation.
So let's get to the main point.
Let's assume that i make everything good to never do post-filtering-like operations even if it's makes me do native SQL queries or use DTO objects for multiselect criteria query and so on.
@BatchSize
every lazyed relations after having carefully think about using eager loading / join and finally choose a lazy relation ? @BatchSize
or can i think "the bigger the better" ? This would mean "is there any a limit of number in "IN" SQL operator that can make my request enough slower to not be worth anymore ? I use Postgres but if you have answers for others SGBD i'm interested too. @BatchSize
on a class isn't producing a lot of results. I still have to annotate every lazy relationships, did i miss something about it or is it useless ?EDIT : The point of my 3 is that i'm getting a different behaviour.
Let say i'm loading a list of entities of class "A" which has a LAZY OneToMany relationship to B. Now i want to print all creationDate of B. So i'm doing a classic 2 for loop.
I annotated B with BatchSize now :
PS : i'm not considering any related query to B that might fire to load B fields with fetch select/subselect.
EDIT 2 : i just found this post Why would I not use @BatchSize on every lazy loaded relationship? althought i googled and search on SO beforeposting my question, guess i didn't use the right words...
However i'm adding something different that might lead to a different answer : when i'm wondering about using BatchSize on every relations, it's after choosing if i want a eager loading, with join / select fetch or if i want lazy loading.
@BatchSize
is meant to be used with lazy associations.@BatchSize
on class level means that the specified batch size for the entity will be applied for all @*ToOne
lazy associations with that entity. See the example with the Person
entity in the documentation.The linked question/answers you provided are more concerned about the need for optimization and lazy loading in general. They apply here as well of course, but they are not related to batch loading only, which is just one of the possible approaches.
Another important thing relates to eager loading which is mentioned in the linked answers and which suggests that if a property is always used then you may get better performance by using eager loading. This is in general not true for collections and in many situations for to-one associations either.
For example, suppose you have the following entity for which bs
and cs
are always used when A
is used.
public class A {
@OneToMany
private Collection<B> bs;
@OneToMany
private Collection<C> cs;
}
Eagerly loading bs
and cs
obviously suffers from N+1 selects problem if you don't join them in a single query. But if you join them in a single query, for example like:
select a from A
left join fetch a.bs
left join fetch a.cs
then you create full Cartesian product between bs
and cs
and returning count(a.bs) x count(a.cs)
rows in the result set for each a
which are read one by one and assembled into the entities of A
and their collections of bs
and cs
.
Batch fetching would be very optimal in this situation, because you would first read A
s, then bs
and then cs
, resulting in more queries but with much less total amount of data that is transferred from the database. Also, the separate queries are much simpler than a big one with joins and are easier for database to execute and optimize.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With