Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ON DUPLICATE KEY UPDATE - precedence

When doing an INSERT INTO {tableA} SELECT a,b,c FROM {tableB} ON DUPLICATE KEY UPDATE x=y

What is the precedence on how the duplicate keys are evaluated? I assume that MySQL first checks to see if a tuple from tableB exists clashes with a unique/primary key in tableA. If the SELECT doesn't include a primary key, or if no other duplicate key exists, then each subsequent UNIQUE INDEX "group" is evaluated and the same checking is performed. But what happens if your tableB has multiple sets of unique, multi-column indexes? Are they evaluated top-to-bottom, as described by SHOW INDEXES FROM tableB ?

Here's my SHOW INDEXES FROM <table>:

Table,Non_unique,Key_name,Seq_in_index,Column_name,Collation
daily_metrics,0,PRIMARY,1,id,A
daily_metrics,0,unique_lineItem_creative_y_m_d,1,line_item_id,A
daily_metrics,0,unique_lineItem_creative_y_m_d,2,creative_id,A
daily_metrics,0,unique_lineItem_creative_y_m_d,3,year,A
daily_metrics,0,unique_lineItem_creative_y_m_d,4,month,A
...

Imagine there are additional sets of unique indexes similar to unique_lineItem_creative_y_m_d

The docs don't seem to illustrate this behavior. https://dev.mysql.com/doc/refman/8.0/en/insert-on-duplicate.html

I also assume that the first matching unique index is used, if a match exists, without any attempt to match subsequent unique indexes that could match. In other words, the first unique index that matches is used, without regard for trying to find the best possible match across all indexes.

like image 308
Jesse Skrivseth Avatar asked Oct 31 '25 21:10

Jesse Skrivseth


1 Answers

You are correct: as soon as MySQL detects a duplicate in any UNIQUE index, it abandons the INSERT and does the update.

The order in which MySQL evaluates the unique indexes does not change the outcome. There's no such thing as a better match for some unique index over another. Because they're unique indexes, any combination of column values that shows up as a duplicate is enough to completely specify the row to be updated.

MySQL's query planner, hopefully, chooses the index that's least costly to evaluate. But, formally speaking, the index it uses for this purpose is unpredictable. This unpredictability is an attribute of declarative languages like SQL. MySQL can do its work any way that works, and it doesn't have to tell you. It can be hard for programmers to grasp, because we're used to procedural languages.

like image 158
O. Jones Avatar answered Nov 03 '25 12:11

O. Jones