I'm currently trying to insert in batch many records (~2000) and Jooq's batchInsert is not doing what I want.
I'm transforming POJOs into UpdatableRecords and then I'm performing batchInsert which is executing insert for each record. So Jooq is doing ~2000 queries for each batch insert and it's killing database performance.
It's executing this code (jooq's batch insert):
for (int i = 0; i < records.length; i++) {
Configuration previous = ((AttachableInternal) records[i]).configuration();
try {
records[i].attach(local);
executeAction(i);
}
catch (QueryCollectorSignal e) {
Query query = e.getQuery();
String sql = e.getSQL();
// Aggregate executable queries by identical SQL
if (query.isExecutable()) {
List<Query> list = queries.get(sql);
if (list == null) {
list = new ArrayList<Query>();
queries.put(sql, list);
}
list.add(query);
}
}
finally {
records[i].attach(previous);
}
}
I could just do it like this (because Jooq is doing same thing internally):
records.forEach(UpdatableRecord::insert);
instead of:
jooq.batchInsert(records).execute();
How can I tell Jooq to create new records in batch mode? Should I transform records into bind queries and then call batchInsert? Any ideas? ;)
jOOQ's DSLContext.batchInsert()
creates one JDBC batch statement per set of consecutive records with identical generated SQL strings (the Javadoc doesn't formally define this, unfortunately).
This can turn into a problem when your records look like this:
+------+--------+--------+
| COL1 | COL2 | COL3 |
+------+--------+--------+
| 1* | {null} | {null} |
| 2* | B* | {null} |
| 3* | {null} | C* |
| 4* | D* | D* |
+------+--------+--------+
.. because in that case, the generated SQL strings will look like this:
INSERT INTO t (col1) VALUES (?);
INSERT INTO t (col1, col2) VALUES (?, ?);
INSERT INTO t (col1, col3) VALUES (?, ?);
INSERT INTO t (col1, col2, col3) VALUES (?, ?, ?);
The reason for this default behaviour is the fact that this is the only way to guarantee ... DEFAULT
behaviour. As in SQL DEFAULT
. I gave a rationale of this behaviour here.
With this in mind, and as each consecutive SQL string is different, the inserts unfortunately aren't batched as a single batch as you intended.
true
One way to enforce all INSERT
statements to be the same is to set all changed flags of each individula record to true
:
for (Record r : records)
r.changed(true);
Now, all SQL strings will be the same.
Loader
APIInstead of batching, you could import the data (and specify batch sizes there). For details, see the manual's section about importing records:
https://www.jooq.org/doc/latest/manual/sql-execution/importing/importing-records
Your usage of batchInsert()
is convenience that works when using TableRecords
. But of course, you can generate an INSERT
statement manually and batch the individual bind variables by using jOOQ's batch statement API:
https://www.jooq.org/doc/latest/manual/sql-execution/batch-execution
There are a couple of open issues regarding the DSLContext.batchInsert()
and similar API. The client side algorithm that generates SQL strings for each individual record is inefficient and might be changed in the future, relying on changed()
flags directly. Some relevant issues:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With