How to handle a large set of data using Spring Data Repositories?

I have a large table that I'd like to access via a Spring Data Repository.

Currently, I'm trying to extend the PagingAndSortingRepository interface, but it seems I can only define methods that return lists, e.g.:

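import java.util.List;
import org.springframework.data.domain.Pageable;
import org.springframework.data.jpa.repository.Query; // assuming Spring Data JPA
import org.springframework.data.repository.PagingAndSortingRepository;

public interface MyRepository extends PagingAndSortingRepository<MyEntity, Integer> {

  // Custom query; the Pageable argument restricts the result to one page.
  @Query(value="SELECT * ...")
  List<MyEntity> myQuery(Pageable p);
}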

On the other hand, the findAll() method that comes with PagingAndSortingRepository returns an Iterable (and I suppose that the data is not loaded into memory).

Is it possible to define custom queries that also return Iterable and/or don't load all the data into memory at once?

Are there any alternatives for handling large tables?

José Ricardo asked Mar 05 '13 18:03



2 Answers

This is the classic consulting answer: it depends. The implementation of the method is store-specific, so we depend on the underlying store API. In the case of JPA, there's no chance to provide streaming access, as ….getResultList() returns a List. Hence we also expose the List to the client, especially as JPA developers are likely used to working with lists. So for JPA the only option is using the pagination API.

For a store like Neo4j we support streaming access, as the repositories return Iterable on CRUD methods as well as on the execution of finder methods.
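For the JPA case, working through a large table with the pagination API could look roughly like the sketch below. It uses plain Java only: fetchPage is a hypothetical stand-in for a repository call such as myQuery(PageRequest.of(page, pageSize)), and fakeTablePage is an invented in-memory "table" so the loop can run without a database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

public class PagedProcessing {

    // Walks a large result set one page at a time, so only `pageSize` rows
    // are held in memory at once. `fetchPage` (page index, page size) is a
    // hypothetical stand-in for a repository method taking a Pageable.
    public static <T> long processInPages(
            BiFunction<Integer, Integer, List<T>> fetchPage,
            int pageSize,
            Consumer<T> handler) {
        long processed = 0;
        int page = 0;
        while (true) {
            List<T> rows = fetchPage.apply(page, pageSize);
            for (T row : rows) {
                handler.accept(row);
                processed++;
            }
            // A short (or empty) page means there is nothing left to read.
            if (rows.size() < pageSize) {
                return processed;
            }
            page++;
        }
    }

    // Invented in-memory "table" with 10 rows, used only for illustration.
    public static List<Integer> fakeTablePage(int page, int pageSize) {
        int total = 10;
        List<Integer> rows = new ArrayList<>();
        int from = page * pageSize;
        for (int i = from; i < Math.min(from + pageSize, total); i++) {
            rows.add(i);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Integer> seen = new ArrayList<>();
        long count = processInPages(PagedProcessing::fakeTablePage, 4, seen::add);
        System.out.println(count + " rows processed");
    }
}
```

With a real repository, paging results are only stable if the query has a deterministic sort order, so it is usually worth passing a Sort along with the page request.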

Oliver Drotbohm answered Jan 24 '23 20:01


The implementation of findAll() simply loads the entire list of all entities into memory. Its Iterable return type doesn't imply that it implements some sort of database-level cursor handling.
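To make that point concrete, here is a small plain-Java sketch. It is not Spring's actual code: findAllEagerly is a hypothetical stand-in for findAll(), and the loaded counter exists only to show when rows are materialised.

```java
import java.util.ArrayList;
import java.util.List;

public class IterableIsNotLazy {

    // Counts how many rows have been materialised so far.
    public static int loaded = 0;

    // Hypothetical stand-in for findAll(): the declared return type is
    // Iterable, but the complete list is built before the method returns.
    public static Iterable<String> findAllEagerly(int tableSize) {
        List<String> all = new ArrayList<>();
        for (int i = 0; i < tableSize; i++) {
            all.add("row-" + i);
            loaded++;
        }
        return all; // a List is an Iterable, so no cursor is involved
    }

    public static void main(String[] args) {
        Iterable<String> rows = findAllEagerly(1000);
        // Nothing has been iterated yet, but every row is already in memory.
        System.out.println("loaded before iterating: " + loaded);
    }
}
```

The return type only states what the caller may do with the result; it says nothing about whether the data behind it was fetched lazily.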

On the other hand, your custom myQuery(Pageable) method will only load one page's worth of entities, because the generated implementation honours its Pageable parameter. You can declare its return type either as Page or List. In the latter case you still receive the same (restricted) number of entities, but not the metadata that a Page would additionally carry.
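As a rough illustration of what that metadata amounts to, the sketch below models a page in plain Java. SimplePage is a simplified, hypothetical class, not Spring Data's Page interface, although its members loosely mirror Page's getContent(), getNumber(), getSize(), getTotalElements(), getTotalPages() and hasNext().

```java
import java.util.Arrays;
import java.util.List;

public class PageVersusList {

    // Simplified, hypothetical model of what a Page adds on top of a List:
    // the same content plus the numbers needed to navigate the full result.
    public static final class SimplePage<T> {
        public final List<T> content;
        public final int number;         // zero-based page index
        public final int size;           // requested page size
        public final long totalElements; // rows matching the query overall

        public SimplePage(List<T> content, int number, int size, long totalElements) {
            this.content = content;
            this.number = number;
            this.size = size;
            this.totalElements = totalElements;
        }

        public int totalPages() {
            return (int) ((totalElements + size - 1) / size); // ceiling division
        }

        public boolean hasNext() {
            return number + 1 < totalPages();
        }
    }

    public static void main(String[] args) {
        // Declared as List: just the rows of the requested page.
        List<String> asList = Arrays.asList("a", "b", "c");

        // Declared as Page: the same rows plus paging metadata.
        SimplePage<String> asPage = new SimplePage<>(asList, 0, 3, 10);
        System.out.println(asPage.content.size() + " rows, "
                + asPage.totalPages() + " pages, hasNext=" + asPage.hasNext());
    }
}
```

In Spring Data JPA, obtaining the total usually costs an additional count query, which is one reason to prefer List when the metadata is not needed.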

So you basically did the right thing to avoid loading all entities into memory in your custom query.

Please review the related documentation here.

zagyi answered Jan 24 '23 21:01