I am using a MERGE
statement as an UPSERT
to either add a new record or update the current one. I have multiple threads driving the database through multiple connections and multiple statements (one connection and statement per thread). I am batching the statements 50 at a time.
I was very surprised to get a duplicate key
violation during my tests. I expected that to be impossible because the MERGE
will be performed as a single transaction, or is it?
My Java code looks like:
private void addBatch(Columns columns) throws SQLException {
try {
// Set parameters.
for (int i = 0; i < columns.size(); i++) {
Column c = columns.get(i);
// Column type is an `enum` with a `set` method appropriate to its type, e.g. setLong, setString etc.
c.getColumnType().set(statement, i + 1, c.getValue());
}
// Add the insert as a batch.
statement.addBatch();
// Ready to execute?
if (++batched >= MaxBatched) {
statement.executeBatch();
batched = 0;
}
} catch (SQLException e) {
log.warning("addBatch failed " + sql + " thread " + Thread.currentThread().getName(), e);
throw e;
}
}
The query looks like this:
MERGE INTO CustomerSpend AS T
USING ( SELECT ? AS ID, ? AS NetValue, ? AS VoidValue ) AS V
ON T.ID = V.ID
WHEN MATCHED THEN
UPDATE SET T.ID = V.ID, T.NetValue = T.NetValue + V.NetValue, T.VoidValue = T.VoidValue + V.VoidValue
WHEN NOT MATCHED THEN
INSERT ( ID,NetValue,VoidValue ) VALUES ( V.ID, V.NetValue, V.VoidValue );
The error reads:
java.sql.BatchUpdateException: Violation of PRIMARY KEY constraint 'PK_CustomerSpend'. Cannot insert duplicate key in object 'dbo.CustomerSpend'. The duplicate key value is (498288 ).
at net.sourceforge.jtds.jdbc.JtdsStatement.executeBatch(JtdsStatement.java:944)
at x.db.Db$BatchedStatement.addBatch(Db.java:299)
...
The key on the table is a PRIMARY
key on the ID
field.
MERGE is a deterministic statement. That is, you cannot update the same row of the target table multiple times in the same MERGE statement.
Starting with SQL Server 2008, you can use a MERGE statement to modify data in a target table based on data in a source table. The statement joins the target to the source by using a column common to both tables, such as a primary key.
The MERGE statement combines INSERT, UPDATE, and DELETE operations into a single statement, eliminating the need to write separate logic for each. It changes the data in a target table based on the data in a source table.
Merge is an atomic operation to update table (target) content using other table (source).
MERGE
is atomic meaning that either all changes are committed or all changes are rolled back.
It does not prevent duplicate keys in case of high concurrency. Adding holdlock
hint will take care of that.
MERGE INTO CustomerSpend WITH (HOLDLOCK) AS T
USING ( SELECT ? AS ID, ? AS NetValue, ? AS VoidValue ) AS V
ON T.ID = V.ID
WHEN MATCHED THEN
UPDATE SET T.ID = V.ID, T.NetValue = T.NetValue + V.NetValue, T.VoidValue = T.VoidValue + V.VoidValue
WHEN NOT MATCHED THEN
INSERT ( ID,NetValue,VoidValue ) VALUES ( V.ID, V.NetValue, V.VoidValue );
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With