Most Hibernate associations support "fetch" parameter:
fetch="join|select"
with "select" being default value.
How to decide which one to use for which association?
I tried changing all from "select" to "join" application wide - number of generated queries decreased probably 10 times but performance stayed exactly the same (even become a tiny bit worse).
Thanks.
A "fetch" join allows associations or collections of values to be initialized along with their parent objects using a single select. This is particularly useful in the case of a collection. It effectively overrides the outer join and lazy declarations of the mapping file for associations and collections.
By default, Hibernate uses lazy select fetching for collections and lazy proxy fetching for single-valued associations. These defaults make sense for most associations in the majority of applications. If you set hibernate. default_batch_fetch_size , Hibernate will use the batch fetch optimization for lazy fetching.
The fetch attribute tells hibernate how to get the children. Select fetching: a second SELECT is used to retrieve the associated entity or collection. Unless you explicitly disable lazy fetching by specifying lazy="false", this second select will only be executed when you access the association.
Join is supposed to solve the n+1 problem. If you have 10 parents, each with 10 children, join will require one query and select will require 11 (one for the parents and one for the children of each parent). This may not be a big deal if the database is on the same server as the application or if the network is really fast, but if there is latency in each database call, it can add up. The join method is a little less efficient on the initial query because you're duplicating the parent columns in every row, but you only make one round-trip to the database.
Generally, if I know I'm going to need the children of all the parents, I go with join. If I'm only going to need the children of a few parents, I use select.
Select will fetch child items by issuing a new query to the database for them. Join will fetch child items by joining them into the parent's query. So that's why you're seeing similar performance, even with a drop in number of queries.
Select:
SELECT * FROM parent WHERE id=(whatever)
SELECT * FROM child WHERE id=(parent.child.id)
Join:
SELECT *
FROM parent
LEFT OUTER JOIN child ON parent.child.id=child.id
WHERE parent.id=(whatever)
As to when to use one over the other... Not entire sure. It likely depends on the database system. If one was always better than the other, I doubt they would bother to give you the option! If you're seeing similar performance for each, I wouldn't worry about it.
If the parent has lots of children and those children in turn have many other, then in this case the initial 'join' might choke the network. My suggestion is to use 'select' in this case to split the selects.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With