Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Improve insert performance massively

In my application I need to massively improve insert performance. Example: A file with about 21K records takes over 100 min to insert. There are reasons it can takes some time, like 20 min or so but over 100 min is just too long.

Data is inserted into 3 tables (many-to-many). Id's are generated from a sequence but I have already googled and set hibernate.id.new_generator_mappings = true and allocationSize + sequence increment to 1000.

Also the amount of data is not anything extraordinary at all, the file is 90 mb.

I have verified with visual vm that most of the time is spent in jdbc driver (postgresql) and hibernate. I think the issue is related to a unique constraint in the child table. The service layer makes a manual check (=SELECT) before inserting. If the record already exists, it reuses it instead of waiting for a constraint exception.

So to sum it up for the specific file there will be 1 insert per table (could be different but not for this file which is the ideal (fastest) case). That means total 60k inserts + 20k selects. Still over 100 min seems very long (yeah hardware counts and it is on a simple PC with 7200 rpm drive, no ssd or raid). However this is an improved version over a previous application (plain jdbc) on which the same insert on this hardware took about 15 min. Considering that in both cases about 4-5 min is spent on "pre-processing" the increase is massive.

Any tips who this could be improved? Is there any batch loading functionality?

like image 766
beginner_ Avatar asked Nov 13 '12 05:11

beginner_


People also ask

How can I improve my insert performance?

To optimize insert speed, combine many small operations into a single large operation. Ideally, you make a single connection, send the data for many new rows at once, and delay all index updates and consistency checking until the very end.

How increase SQL Server insert performance?

Drop Index before Insertion of Data We should drop the index before insertion of a large amount of data. This makes the insert statement run faster.

How can I speed up insert in Oracle?

One of the most common ways to improve the performance of an INSERT operation is to use the APPEND optimizer hint. APPEND forces the optimizer to perform a direct path INSERT and appends new values above the high water mark (the end of the table) while new blocks are being allocated.


1 Answers

see

spring-data JPA: manual commit transaction and restart new one

Add entityManager.flush() and entityManager.clear() after every n-th call to save() method. If you use hibernate add hibernate.jdbc.batch_size=100 which seems like a reasonable choice.

Performance increase was > 10x, probably close to 100x.

like image 114
beginner_ Avatar answered Jan 04 '23 07:01

beginner_