I'm trying to understand why saveAll performs better than save in Spring Data repositories. I'm using CrudRepository, which can be seen here.
To test this, I created 10k entities, each with just an id and a random string (kept constant for the benchmark), and added them to a list. Iterating over the list and calling .save on each element took 40 seconds. Calling .saveAll on the same list completed in 2 seconds. Calling .saveAll with 30k elements took 4 seconds. I truncated the table before each test. Even batching the .saveAll calls into sublists of 50 took 10 seconds with 30k elements.
The simple .saveAll with the entire list seems to be the fastest.
I tried browsing the Spring Data source code, but this is the only thing of value I found. There, .saveAll seems to simply iterate over the entire Iterable and call .save on each element, just as I was doing. So how is it that much faster? Is it doing some transactional batching internally?
saveAll(bookList); In our tests, we noticed that the first method took around 2 seconds, and the second one took approximately 0.3 seconds. Furthermore, when we enabled JPA Batch Inserts, we observed a decrease of up to 10% in the performance of the save() method, and an increase of up to 60% on the saveAll() method.
I rely on Spring Data JPA to handle all transactions and I noticed a huge speed difference between this and my old configuration. Storing around 1000 elements was being done in around 6s and now it's taking over 25 seconds.
Yes, you can mix both new and existing entities in the passed list. One strange issue with this is that saveAll can't accurately determine whether an entity is new or not.
Without seeing your code I have to guess, but I believe it has to do with the overhead of creating a new transaction for each object saved in the case of save, versus opening a single transaction in the case of saveAll.
Notice the definitions of save and saveAll: they are both annotated with @Transactional. If your project is configured properly, which seems to be the case since entities are being saved to the database, a transaction will be created whenever one of these methods is called. If you call save in a loop, a new transaction is created each time you call save, but in the case of saveAll there is one call and therefore one transaction, regardless of the number of entities being saved.
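To make the difference concrete, here is a minimal sketch in plain Java that only counts commits. It is not Spring code: CountingTx and FakeRepository are hypothetical stand-ins that mimic a transactional wrapper around each repository method, on the assumption that a nested call joins the transaction that is already open.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a transaction manager: it only counts commits.
class CountingTx {
    int commits = 0;
    private boolean active = false;

    void inTransaction(Runnable work) {
        if (active) {        // a transaction is already open: join it
            work.run();
            return;
        }
        active = true;       // otherwise start a new one
        try {
            work.run();
            commits++;       // one commit per outermost transaction
        } finally {
            active = false;
        }
    }
}

// Hypothetical repository whose public methods each run in a transaction.
class FakeRepository {
    final CountingTx tx;
    final List<String> rows = new ArrayList<>();

    FakeRepository(CountingTx tx) {
        this.tx = tx;
    }

    void save(String entity) {
        tx.inTransaction(() -> rows.add(entity));
    }

    void saveAll(List<String> entities) {
        // Still calls save per element, but inside one surrounding transaction.
        tx.inTransaction(() -> entities.forEach(this::save));
    }
}

class SaveVsSaveAllDemo {
    static List<String> entities(int n) {
        List<String> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add("entity-" + i);
        }
        return list;
    }

    static int commitsWithSaveLoop(int n) {
        CountingTx tx = new CountingTx();
        FakeRepository repo = new FakeRepository(tx);
        for (String e : entities(n)) {
            repo.save(e);    // each call opens and commits its own transaction
        }
        return tx.commits;
    }

    static int commitsWithSaveAll(int n) {
        CountingTx tx = new CountingTx();
        FakeRepository repo = new FakeRepository(tx);
        repo.saveAll(entities(n));
        return tx.commits;
    }

    public static void main(String[] args) {
        System.out.println("save loop commits: " + commitsWithSaveLoop(1000)); // 1000
        System.out.println("saveAll commits:   " + commitsWithSaveAll(1000));  // 1
    }
}
```

Against a real database each commit carries its own cost, so the count above is only a rough proxy for where the time may be going.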
I'm assuming that the test itself is not being run within a transaction. If it were, all calls to save would run within that transaction, since the default transaction propagation is Propagation.REQUIRED: if a transaction is already open, the calls run within it. If you're planning to use Spring Data, I strongly recommend that you read about transaction management in Spring.
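The join-or-create behaviour of Propagation.REQUIRED can be sketched without Spring as well. The class below is a hypothetical illustration, with a per-thread flag standing in for "a transaction is open"; it is not how Spring implements propagation internally, and the names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class RequiredPropagationSketch {
    // Hypothetical per-thread flag standing in for "a transaction is open".
    private static final ThreadLocal<Boolean> OPEN =
            ThreadLocal.withInitial(() -> false);
    static final List<String> log = new ArrayList<>();

    // REQUIRED semantics: join the open transaction, or begin a new one.
    static void required(String name, Runnable body) {
        if (OPEN.get()) {
            log.add(name + ": joined existing");
            body.run();
            return;
        }
        OPEN.set(true);
        log.add(name + ": began new");
        try {
            body.run();
            log.add(name + ": committed");
        } finally {
            OPEN.set(false);
        }
    }

    // Simulates a test method that is itself transactional and calls save twice.
    static List<String> run() {
        log.clear();
        required("test", () -> {
            required("save#1", () -> { });
            required("save#2", () -> { });
        });
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In this sketch the two inner calls never begin or commit anything themselves, which is why a save loop running inside an outer transaction would not pay the per-call transaction cost.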