I have some code that performs an UPSERT, also known as a merge. I want to clean up this code: specifically, I want to move away from exception handling and reduce the verbosity and sheer complexity of the code for such a simple operation. The requirement is to insert each item unless it already exists:
    public void batchInsert(IncomingItem[] items) {
        try(Session session = sessionFactory.openSession()) {
            batchInsert(session, items);
        }
        catch(PersistenceException e) {
            if(e.getCause() instanceof ConstraintViolationException) {
                logger.warn("attempting to recover from constraint violation");
                DateTimeFormatter dbFormat = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");
                items = Arrays.stream(items).filter(item -> {
                    int n = db.queryForObject("select count(*) from rets where source = ? and systemid = ? and updtdate = ?::timestamp",
                        Integer.class,
                        item.getSource().name(), item.getSystemID(),
                        dbFormat.format(item.getUpdtDateObj()));
                    if(n != 0) {
                        logger.warn("REMOVED DUPLICATE: " +
                            item.getSource() + " " + item.getSystemID() + " " + item.getUpdtDate());
                        return false;
                    }
                    else {
                        return true; // keep
                    }
                }).toArray(IncomingItem[]::new);
                try(Session session = sessionFactory.openSession()) {
                    batchInsert(session, items);
                }
            }
        }
    }
An initial search of SO is unsatisfactory:
- The @SQLInsert annotation; unfortunately, this solution is too error-prone to use on wider tables, and maintenance-intensive in evolving applications.
- merge(), which is OK when single-threaded.
In the question How to do ON DUPLICATE KEY UPDATE in Spring Data JPA?, which was marked as a duplicate, I noticed this intriguing comment:
That was a dead end, as I really don't understand the comment, despite it sounding like a clever solution and mentioning the "actual same SQL statement".
Another promising approach is this: Hibernate and Spring modify query Before Submitting to DB
ON CONFLICT DO NOTHING / ON DUPLICATE KEY UPDATE
Both of the major open-source databases support a mechanism to push idempotency down to the database. The examples below use the PostgreSQL syntax, but can be easily adapted for MySQL.
By following the ideas in Hibernate and Spring modify query Before Submitting to DB, Hooking into Hibernate's query generation, and How I can configure StatementInspector in Hibernate?, I implemented:
    import org.hibernate.resource.jdbc.spi.StatementInspector;

    @SuppressWarnings("serial")
    public class IdempotentInspector implements StatementInspector {
        @Override
        public String inspect(String sql) {
            if(sql.startsWith("insert into rets")) {
                sql += " ON CONFLICT DO NOTHING";
            }
            return sql;
        }
    }
with property
<prop key="hibernate.session_factory.statement_inspector">com.myapp.IdempotentInspector</prop>
Unfortunately this leads to the following error when a duplicate is encountered:
Caused by: org.springframework.orm.hibernate5.HibernateOptimisticLockingFailureException: Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1; nested exception is org.hibernate.StaleStateException: Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1
Which makes sense if you think about what's going on under the covers: the ON CONFLICT DO NOTHING clause causes zero rows to be inserted, but one insert is expected.
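To make the mismatch concrete, here is a simplified model of a strict "one row per statement" check. It is not Hibernate's actual code; the class name RowCountCheckSketch and the method strictCheck are made up for illustration, and only the message format is copied from the error above.

```java
/** Simplified model of a strict row-count check; NOT Hibernate's real implementation. */
final class RowCountCheckSketch {

    /**
     * Fails as soon as one statement in the batch affected a different number
     * of rows than expected (hypothetical helper, for illustration only).
     */
    static void strictCheck(int[] actualRowCounts, int expectedPerStatement) {
        for (int i = 0; i < actualRowCounts.length; i++) {
            if (actualRowCounts[i] != expectedPerStatement) {
                throw new IllegalStateException(
                    "Batch update returned unexpected row count from update [" + i
                        + "]; actual row count: " + actualRowCounts[i]
                        + "; expected: " + expectedPerStatement);
            }
        }
    }
}
```

An insert that ON CONFLICT DO NOTHING skipped reports a row count of 0, so a check like strictCheck(new int[] {0}, 1) fails, while a batch in which every insert really happened passes.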
Is there a solution that enables thread-safe exception-free concurrent idempotent inserts and doesn't require manually defining the entire SQL insert statement to be executed by Hibernate?
For what it's worth, I feel that the approaches that push the dupcheck down to the database are the path to a proper solution.
CLARIFICATION
The IncomingItem objects consumed by the batchInsert method originate from a system where records are immutable. Under this special condition, ON CONFLICT DO NOTHING behaves the same as an UPSERT, notwithstanding possible loss of the Nth update.
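A small in-memory model of that equivalence, assuming each row is reduced to a key string plus a payload string (both hypothetical stand-ins for the real columns). The map plays the role of the table; no database is involved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** In-memory model only; rows are {key, payload} pairs (assumed shape). */
final class ImmutableRecordsSketch {

    /** Models ON CONFLICT DO NOTHING: the first row stored under a key wins. */
    static Map<String, String> insertDoNothing(String[][] rows) {
        Map<String, String> table = new LinkedHashMap<>();
        for (String[] row : rows) {
            table.putIfAbsent(row[0], row[1]);
        }
        return table;
    }

    /** Models an UPSERT: the last row stored under a key wins. */
    static Map<String, String> upsert(String[][] rows) {
        Map<String, String> table = new LinkedHashMap<>();
        for (String[] row : rows) {
            table.put(row[0], row[1]);
        }
        return table;
    }
}
```

When records are immutable, two rows with the same key always carry the same payload, so both methods end up with the same table. They only diverge when a later row with an existing key carries different data.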
Short answer: Hibernate does not support it out of the box (as confirmed by a Hibernate guru in this blog post). You could probably make it work to some extent in some scenarios with the mechanisms you already described, but using native queries directly looks like the most straightforward approach to me for this purpose.
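A minimal sketch of the native-query route with plain JDBC batching, under some loud assumptions: the table is rets with only the three columns that appear in the question's query (the real table is presumably wider), and Row is a hypothetical stand-in for IncomingItem. The syntax is PostgreSQL's.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.util.List;

/** Hypothetical helper; table and column names are taken from the question's query. */
final class IdempotentRetsInsert {

    // PostgreSQL: rows that would violate a unique constraint are skipped silently.
    static final String SQL =
        "insert into rets (source, systemid, updtdate) values (?, ?, ?) on conflict do nothing";

    /** Minimal stand-in for the question's IncomingItem (assumed shape). */
    static final class Row {
        final String source;
        final String systemId;
        final Timestamp updtDate;

        Row(String source, String systemId, Timestamp updtDate) {
            this.source = source;
            this.systemId = systemId;
            this.updtDate = updtDate;
        }
    }

    /** Sends all rows in one JDBC batch and returns how many were reported as inserted. */
    static int insertAll(Connection connection, List<Row> rows) throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(SQL)) {
            for (Row row : rows) {
                ps.setString(1, row.source);
                ps.setString(2, row.systemId);
                ps.setTimestamp(3, row.updtDate);
                ps.addBatch();
            }
            return countInserted(ps.executeBatch());
        }
    }

    /**
     * Sums the positive update counts. A skipped duplicate reports 0; a driver may
     * also report Statement.SUCCESS_NO_INFO (negative), which is not counted here.
     */
    static int countInserted(int[] updateCounts) {
        int inserted = 0;
        for (int count : updateCounts) {
            if (count > 0) {
                inserted += count;
            }
        }
        return inserted;
    }
}
```

Inside a Hibernate session, something like session.doWork(connection -> IdempotentRetsInsert.insertAll(connection, rows)) should hand over the session's JDBC connection. Nothing checks the row count here, so a skipped duplicate is simply a 0 in the result and not an exception.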
The longer answer is that it would be hard to support, I guess, considering all the aspects of Hibernate, e.g.:
- @Audit-ed entities: are they created or updated, and if updated, what has changed?
Even if Hibernate supported it in some way, I'm not sure I'd use that feature if there were too many caveats to watch out for and take into consideration.
So, the rule of thumb I follow is: