Optimize JPA dynamic count query

Question

Having the typical method which returns a paginated result, using CriteriaBuilder and performing 2 queries:

one that counts the total number of results
and another one that gives us the subset for the specified page

We have noticed that the first query, JPA does not optimize it at all because it's using the exists (from Oracle).

Java code:

Root<Foo> from = criteriaQuery.from(Foo.class);
//... predicates
CriteriaQuery<Long> countQuery = criteriaBuilder.createQuery(Long.class)
        .select(criteriaBuilder.countDistinct(from))
        .where(predicates.toArray(new Predicate[predicates.size()]));
Long numberResults = entityManager.createQuery(countQuery).getSingleResult();

SQL generated query:

SELECT COUNT(t0.REFERENCE) 
FROM foo t0 
WHERE EXISTS (
  SELECT t1.REFERENCE 
  FROM foo t1 
  WHERE ((((t0.REFERENCE = t1.REFERENCE) AND (t0.VERSION_NUM = t1.VERSION_NUM)) AND (t0.ISSUER = t1.ISSUER)) AND (t1.REFERENCE LIKE ? AND (t1.VERSION_STATUS = ?)))
);

How do I avoid using the exists? Is there something wrong with the java code?

jccampanero · Accepted Answer

For different reasons, this issue and this related article enumerate some of them, EclipseLink uses EXISTS in the countDistinct operation implementation.

Although I can agree with you, be aware that the performance offered by EXISTS in Oracle is in fact very dependent of the use case, and it doesn't have to be poor. Please, consider review this mythical blog entry in the Tom Kyte blob.

So my advice is, please, keep using the generated code and corresponding SQL.

If you need or want to use a different approach, a perhaps more performant way of counting the records could be fetching the ids of the entities that match the provided predicates (the actual performance in fact is mostly dependent on these predicates in fact), and count the results in memory, with Java. I mean:

CriteriaBuilder cb = entityManager.getCriteriaBuilder();
// I assume reference is String here
CriteriaQuery<String> query = cb.createQuery(String.class);
Root<Foo> root = query.from(Foo.class);

query
  .select(root.get("reference"))
    .distinct(true)
  .where(predicates.toArray(new Predicate[predicates.size()]))
;
List<String> references = entityManager.createQuery(query).getResultList();
int count = references.size();

Although I think it is always not advisable, if the amount of data is not large, you could even fetch the results once from the database, and do the paging in memory with Java, it is straightforward using subList, for instance.

At a final word, AFAIK other JPA providers such as Hibernate implements count in a different way: if switching the JPA provider is an option you could try using it instead.

p3consulting · Answer

With or without EXISTS, the query plans are identical. The only optimisation would be to return COUNT() and the result in the same query, easy to do in SQL with "OVER()". But mapping the Foo.class on a view and adding a transient column to contain the count will complicate a lot of other parts of the application, and mapping the result of paginated queries on a new CountedFoo.class will also complicate the solution.

Optimize JPA dynamic count query

Tags:

oracle-database

java

sql

jpa

eclipselink

anat0lius

2 Answers

jccampanero

p3consulting

Recent Activity

Donate For Us

Optimize JPA dynamic count query

Tags:

oracle-database

java

sql

jpa

eclipselink

anat0lius

2 Answers

jccampanero

p3consulting

Related questions

Recent Activity

Donate For Us