Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Many-column UPDATE-JOIN with many-ISNULL takes a long time?

We have a stored proceudre in our database that updates a table by joining 2 tables on 30 columns with a where condition. The SQL Is in the general format of:

UPDATE Target
SET col1 = Source.col1
INNER JOIN Source 
on 
ISNULL(Target.Col2, '') = ISNULL(Source.Col2, '') and
ISNULL(Target.Col3, '') = ISNULL(Source.Col3, '') and
.
.
.
ISNULL(Target.Col31, '') = ISNULL(Source.Col31, '') and 

Here's the Query Plan. Save it to your PC and reopen it so it scales better.

enter image description here

The Source table has 65M records, the Target 165M. Previously it used to run in quite a few minutes. Considering how ugly and potentially inefficient the query is, I find this surprising. This month it ran for 1.5 hours, used 100% of the processor and we had to kill it.

Any suggestions how to improvise the below query and make it run on time..?

We have single column Indexes on a few of the columns used in the 30-col join condition.

I know the ISNULL function and the join on 30 columns is nuts and this is a bad design. Don't blame me, I inherited this economy.

Unfortunately there is no time for a re-design. Any suggestions?

like image 874
Chad Avatar asked May 02 '12 21:05

Chad


1 Answers

  1. Please post a screenshot of the estimated execution plan
  2. I suspect that previously the query used a hash join (which it should) but somehow cardinality estimation has it wrong now and you get a loop join. Slap a hash join hint on the query to see if it fixed this (INNER HASH JOIN). Once we have the exact plan we can say more.
  3. Change the equality to (A1 = A2 OR (A1 IS NULL AND A2 IS NULL)). SQL Server actually recognizes this pattern and converts it to an "exact equals without stupid null semantics" internally. You can seek indexes that way even with null values.

If this doesn't help, be sure to do step (3) and create a covering index on col2-col31 including col1. This will get you a merge join which is the most efficient plan possible in this case. It is really fast. Warning: This will double the on-disk size of the table and slow down updates.

like image 118
usr Avatar answered Nov 07 '22 23:11

usr