
Optimize JPA dynamic count query

We have a typical method that returns a paginated result. It uses CriteriaBuilder and performs 2 queries:

  • one that counts the total number of results
  • and another one that gives us the subset for the specified page

We have noticed that JPA does not optimize the first query at all: the SQL it generates for Oracle wraps the predicates in an EXISTS subquery.

Java code:

Root<Foo> from = criteriaQuery.from(Foo.class);
//... predicates
CriteriaQuery<Long> countQuery = criteriaBuilder.createQuery(Long.class)
        .select(criteriaBuilder.countDistinct(from))
        .where(predicates.toArray(new Predicate[predicates.size()]));
Long numberResults = entityManager.createQuery(countQuery).getSingleResult();

SQL generated query:

SELECT COUNT(t0.REFERENCE) 
FROM foo t0 
WHERE EXISTS (
  SELECT t1.REFERENCE 
  FROM foo t1 
  WHERE ((((t0.REFERENCE = t1.REFERENCE) AND (t0.VERSION_NUM = t1.VERSION_NUM)) AND (t0.ISSUER = t1.ISSUER)) AND (t1.REFERENCE LIKE ? AND (t1.VERSION_STATUS = ?)))
);

How do I avoid the EXISTS? Is there something wrong with the Java code?

anat0lius asked Feb 05 '26 00:02


2 Answers

EclipseLink implements the countDistinct operation with EXISTS for several reasons; this issue and this related article enumerate some of them.

Although I can see your point, be aware that the performance of EXISTS in Oracle depends heavily on the use case, and it does not have to be poor. Please consider reviewing this legendary entry in Tom Kyte's blog.

So my advice is to keep using the generated code and its corresponding SQL.

If you need or want a different approach, a possibly more performant way of counting the records is to fetch the ids of the entities that match the provided predicates and count the results in memory, with Java (the actual performance depends mostly on those predicates). I mean:

CriteriaBuilder cb = entityManager.getCriteriaBuilder();
// I assume reference is a String here
CriteriaQuery<String> query = cb.createQuery(String.class);
Root<Foo> root = query.from(Foo.class);

// ... build the predicates against this root

query
    .select(root.get("reference"))
    .distinct(true)
    .where(predicates.toArray(new Predicate[0]));

// Fetch only the distinct ids and count them in memory
List<String> references = entityManager.createQuery(query).getResultList();
int count = references.size();

Although I would not generally recommend it, if the amount of data is not large you could even fetch all the results from the database once and do the paging in memory with Java; it is straightforward using subList, for instance.
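The in-memory paging mentioned above could look like the following minimal sketch. The class name InMemoryPager and the zero-based pageIndex convention are assumptions made for illustration; they do not come from the question.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper: slices an already-fetched result list into pages.
final class InMemoryPager {
    private InMemoryPager() {}

    // Returns the zero-based page, or an empty list when the page is out of range.
    static <T> List<T> page(List<T> all, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        // Use long arithmetic so a large pageIndex cannot overflow int.
        long from = (long) pageIndex * pageSize;
        if (from >= all.size()) {
            return Collections.emptyList();
        }
        int to = (int) Math.min(from + pageSize, all.size());
        // subList returns a view backed by the original list, not a copy.
        return all.subList((int) from, to);
    }
}
```

With this approach the total count is simply the size of the fetched list, so no separate count query is needed, at the cost of loading every matching row.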

As a final word, AFAIK other JPA providers such as Hibernate implement count in a different way: if switching the JPA provider is an option, you could try using one of them instead.

jccampanero answered Feb 06 '26 14:02


With or without EXISTS, the query plans are identical. The only real optimisation would be to return the count and the page of results in the same query, which is easy to do in SQL with COUNT(*) OVER(). However, mapping Foo.class to a view and adding a transient column to hold the count would complicate many other parts of the application, and mapping the result of the paginated queries to a new CountedFoo.class would complicate the solution as well.
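As a rough sketch of the single-query idea, the code below assumes a native query whose last column is COUNT(*) OVER() and whose rows arrive as Object[] arrays, as a JPA native query typically returns for a multi-column select. The class name CountedPage, the SQL text, and the Oracle 12c+ OFFSET/FETCH pagination are illustrative assumptions, not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical holder for one page of rows plus the total count taken from the same query.
final class CountedPage {
    // Assumed SQL: the analytic count is computed over all matching rows,
    // before the OFFSET/FETCH row limiting is applied.
    static final String SQL =
        "SELECT t.REFERENCE, t.VERSION_NUM, t.ISSUER, COUNT(*) OVER () AS TOTAL "
        + "FROM foo t WHERE t.REFERENCE LIKE ? AND t.VERSION_STATUS = ? "
        + "ORDER BY t.REFERENCE OFFSET ? ROWS FETCH NEXT ? ROWS ONLY";

    final List<Object[]> rows;
    final long total;

    private CountedPage(List<Object[]> rows, long total) {
        this.rows = rows;
        this.total = total;
    }

    // Splits each raw row into its data columns and the trailing total column.
    static CountedPage fromRows(List<Object[]> raw) {
        List<Object[]> data = new ArrayList<>(raw.size());
        long total = 0L; // an empty page carries no count, so 0 is only a fallback
        for (Object[] row : raw) {
            int last = row.length - 1;
            // Drivers may return the count as BigDecimal, Long, etc.; Number covers them.
            total = ((Number) row[last]).longValue();
            data.add(Arrays.copyOf(row, last));
        }
        return new CountedPage(data, total);
    }
}
```

Note the trade-off: when the requested page is empty (for example, past the last page), no row carries the total, so a caller would still need a fallback count query in that case.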

p3consulting answered Feb 06 '26 13:02


