While converting a project from iBATIS to JPA 2.1, I've run into a problem: I have to load a complete object graph for a set of objects without hitting N+1 selects and, for performance reasons, without using Cartesian products.
A user's query will yield a List<Task>, and I need to make sure that when I return the tasks, they have all properties populated, including parent, children, dependencies and properties. First let me explain the two entity objects involved.
A Task is part of a hierarchy. It can have a parent Task and it can also have children. A Task can depend on other tasks, expressed by the 'dependencies' property. A Task can have many properties, expressed by the 'properties' property.
The example objects have been simplified as much as possible and boilerplate code is removed.
@Entity
public class Task {
    @Id
    private Long id;

    @ManyToOne(fetch = LAZY)
    private Task parent;

    @ManyToOne(fetch = LAZY)
    private Task root;

    @OneToMany(mappedBy = "task")
    private List<TaskProperty> properties;

    @ManyToMany
    @JoinTable(name = "task_dependency", inverseJoinColumns = { @JoinColumn(name = "depends_on") })
    private List<Task> dependencies;

    @OneToMany(mappedBy = "parent")
    private List<Task> children;
}
@Entity
public class TaskProperty {
    @Id
    private Long id;

    @ManyToOne(fetch = LAZY)
    private Task task;

    private String name;
    private String value;
}
The Task hierarchy for a given task can be arbitrarily deep, so to make it easier to get the whole graph, a Task has a pointer to its root task via the 'root' property.
In iBATIS, I simply fetched all Tasks for the distinct list of root IDs, and then did ad-hoc queries for all properties and dependencies with a "task_id IN ()" query. When I had those, I used Java code to add properties, children and dependencies to all model objects so that the graph was complete. For a task list of any size, I would then only do 3 SQL queries, and I'm trying to do the same with JPA. Since the 'parent' property indicates where to add the children, I didn't even have to query for those.
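As a rough illustration of that stitching step (this is not the original project code; the `TaskNode` holder, the `GraphStitcher` class and the row shapes are made up for this sketch), the in-memory assembly after the three queries could look something like this:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical plain holder standing in for one row of the task query.
class TaskNode {
    final long id;
    final Long parentId; // null for a root task
    TaskNode parent;
    final List<TaskNode> children = new ArrayList<>();
    final List<TaskNode> dependencies = new ArrayList<>();
    final Map<String, String> properties = new LinkedHashMap<>();

    TaskNode(long id, Long parentId) {
        this.id = id;
        this.parentId = parentId;
    }
}

class GraphStitcher {
    /**
     * taskRows:       every task below the selected roots (query 1)
     * propertyRows:   {taskId, name, value} triples (query 2)
     * dependencyRows: {taskId, dependsOnId} pairs (query 3)
     */
    static Map<Long, TaskNode> stitch(List<TaskNode> taskRows,
                                      List<String[]> propertyRows,
                                      List<long[]> dependencyRows) {
        // Index all tasks by id so every later lookup is O(1).
        Map<Long, TaskNode> byId = new HashMap<>();
        for (TaskNode t : taskRows) {
            byId.put(t.id, t);
        }
        // Children are derived from the parent pointer, so no extra query is needed.
        for (TaskNode t : taskRows) {
            TaskNode p = (t.parentId == null) ? null : byId.get(t.parentId);
            if (p != null) {
                t.parent = p;
                p.children.add(t);
            }
        }
        for (String[] row : propertyRows) {
            TaskNode t = byId.get(Long.parseLong(row[0]));
            if (t != null) {
                t.properties.put(row[1], row[2]);
            }
        }
        // Dependencies that point outside the loaded set are skipped in this sketch.
        for (long[] row : dependencyRows) {
            TaskNode t = byId.get(row[0]);
            TaskNode d = byId.get(row[1]);
            if (t != null && d != null) {
                t.dependencies.add(d);
            }
        }
        return byId;
    }
}
```

The point is only that the number of queries stays constant while the wiring is done in a few linear passes over the rows.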
I've tried different approaches, including:
One possible solution could be to create new Task objects that are not managed by JPA and sew my hierarchy together using those, and I guess I can live with that, but it doesn't feel very "JPA", and then I couldn't use JPA for what it's good at - tracking and persisting changes to my objects automatically.
Any hints would be greatly appreciated. I'm open to using vendor-specific extensions if necessary. I'm running in WildFly 8.1.0.Final (Java EE 7 Full Profile) with Hibernate 4.3.5.Final.
There are some strategies to achieve your goals:
Sub-select fetching loads all lazy associations of a given type with one additional sub-select, the very first time you access a lazy association of that type. This sounds appealing at first, but you don't control how many entities that sub-select ends up fetching, which makes your app fragile, and the behavior may propagate to other service methods.
Batch fetching is easier to control, since you can cap the number of entities loaded in one batch, and it is less likely to affect other use cases.
Using a recursive common table expression, if your DB supports it.
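For the first two options, a mapping fragment along these lines might be a starting point. It is only a sketch: it reuses the `Task` and `TaskProperty` entities from the question, relies on Hibernate-specific annotations (`@Fetch`, `@BatchSize`), and the batch size of 50 is an arbitrary value chosen for illustration.

```java
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.Id;
import javax.persistence.ManyToOne;
import javax.persistence.OneToMany;
import org.hibernate.annotations.BatchSize;
import org.hibernate.annotations.Fetch;
import org.hibernate.annotations.FetchMode;

@Entity
public class Task {
    @Id
    private Long id;

    @ManyToOne(fetch = FetchType.LAZY)
    private Task parent;

    // Sub-select fetching: the first access to one of these collections loads
    // them for every Task returned by the query that produced the owners.
    @OneToMany(mappedBy = "task")
    @Fetch(FetchMode.SUBSELECT)
    private List<TaskProperty> properties;

    // Batch fetching: the first access initializes this collection for up to
    // 50 (assumed value) Tasks in the current session in one select.
    @OneToMany(mappedBy = "parent")
    @BatchSize(size = 50)
    private List<Task> children;
}
```

Both settings live in the mapping, so they apply to every use case that touches these associations, which is the trade-off mentioned above.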
In the end, it's all about what you plan on doing with the selected rows. If it's just about displaying them in a view, then a native query is more than enough.
If you need to retain the entities across multiple requests (the first for the view part, the second for the update part), then entities are a better approach.
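A native query with a recursive CTE could be sketched as follows. Everything schema-related here is an assumption: the table name `task`, the columns `id` and `parent_id`, and the PostgreSQL-style `WITH RECURSIVE` syntax (some databases omit the `RECURSIVE` keyword). `TaskTreeQueries` and `loadSubtree` are hypothetical names.

```java
import java.util.List;
import javax.persistence.EntityManager;

public class TaskTreeQueries {

    // Assumed schema: table "task" with columns "id" and "parent_id".
    private static final String SUBTREE_SQL =
          "WITH RECURSIVE subtree AS ( "
        + "  SELECT t.* FROM task t WHERE t.id = ?1 "
        + "  UNION ALL "
        + "  SELECT c.* FROM task c JOIN subtree s ON c.parent_id = s.id "
        + ") "
        + "SELECT * FROM subtree";

    // Returns the root task and all of its descendants in a single statement.
    @SuppressWarnings("unchecked")
    public static List<Task> loadSubtree(EntityManager em, Long rootId) {
        return em.createNativeQuery(SUBTREE_SQL, Task.class)
                 .setParameter(1, rootId)
                 .getResultList();
    }
}
```

With the 'root' pointer from the question the recursion is not strictly required, but the same query shape works for hierarchies that lack such a pointer.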
From your response, I see you need to issue an EntityManager.merge() and probably rely on cascading to propagate children's state transitions (add/remove).
Since we are talking about 3 JPA queries, you should be fine with JPA as long as you don't get a Cartesian Product.
You should strive for the minimum number of queries, but that doesn't mean you always have to have one and only one query. Two or three queries are not an issue at all.
As long as you control the number of queries and don't run into an N+1 query issue, you are fine with more than one query too. Trading a Cartesian Product (2 one-to-many fetches) for one join and one additional select is a good deal anyway.
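One way to sketch this with JPQL is to fetch a single collection per query and let the persistence context reuse the same managed `Task` instances across queries. The class and method names below are hypothetical, and the sketch assumes that a root task's 'root' property points to itself, so roots are included in the result; adjust the predicate if that doesn't hold in your data.

```java
import java.util.List;
import javax.persistence.EntityManager;

public class TaskGraphLoader {

    // Loads every Task below the given roots using three selects, each
    // join-fetching one collection, so no query joins two collections at once.
    public static List<Task> load(EntityManager em, List<Long> rootIds) {
        // Query 1: tasks plus their properties.
        List<Task> tasks = em.createQuery(
                "select distinct t from Task t "
              + "left join fetch t.properties "
              + "where t.root.id in :rootIds", Task.class)
            .setParameter("rootIds", rootIds)
            .getResultList();

        // Query 2: same tasks plus dependencies. Within one persistence
        // context the already-managed instances get this collection initialized.
        em.createQuery(
                "select distinct t from Task t "
              + "left join fetch t.dependencies "
              + "where t.root.id in :rootIds", Task.class)
            .setParameter("rootIds", rootIds)
            .getResultList();

        // Query 3: same tasks plus children.
        em.createQuery(
                "select distinct t from Task t "
              + "left join fetch t.children "
              + "where t.root.id in :rootIds", Task.class)
            .setParameter("rootIds", rootIds)
            .getResultList();

        return tasks;
    }
}
```

All three queries have to run in the same EntityManager for the instances to be shared, and it is worth checking the generated SQL in the log to confirm that no extra lazy loads are triggered afterwards.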
In the end, you should always check the EXPLAIN ANALYZE query plan and reinforce/rethink your strategy.