Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How could a INNER/RIGHT/LEFT JOIN be 14x slower than a FULL JOIN?

I have a query that takes 2.5 seconds with FULL JOIN, and 40 seconds with INNER, RIGHT or LEFT JOIN.

Here is the query. The subquery (done twice) only takes 1.3 seconds on its own.

SELECT T1.[time], T1.Total, T1.rn, T2.[time], T2.Total, T2.rn
FROM
(
select [time], MAX(ComputedValue) as Total, row_number() over (order by [time]) as rn
FROM
(
    select SUBSTRING(CONVERT(CHAR(10), IntervalStartTime, 108), 0, 6) as [time], ComputedValue
    from LoadTestTransactionSample
    where LoadTestRunId=285
    and CounterName='Total Transactions' 
    and TransactionName='Export'
) foo
group by [time]
) T1
_____ JOIN
(
select [time], MAX(ComputedValue) as Total, row_number() over (order by [time]) as rn
FROM
(
    select SUBSTRING(CONVERT(CHAR(10), IntervalStartTime, 108), 0, 6) as [time], ComputedValue
    from LoadTestTransactionSample
    where LoadTestRunId=285
    and CounterName='Total Transactions' 
    and TransactionName='Export'
) foo
group by [time]
) T2
ON T1.rn = T2.rn - 1

The select SUBSTRING bit is just getting an HH:MM string out of a DateTime. LoadTestTransactionSample is actually a VIEW that joins across 8 tables. (FYI the database is a Visual Studio load test results store). Here are its (relevant) columns:

LoadTestRunId INT NOT NULL
CounterName NVARCHAR(255) NOT NULL
TransactionName NVARCHAR(64) NOT NULL
IntervalStartTime DATETIME NOT NULL
IntervalEndTime DATETIME NOT NULL
ComputedValue REAL

A FULL JOIN returns an extra unwanted row, so I do need to do a RIGHT JOIN to get the right answer.

I'm not really looking for a solution (I have one: pre-fetch the subquery into a table variable use SQL Server 2012 analytic function 'LAG', thanks @a1ex07), just some understanding as to what could possibly cause the extreme difference in performance between these join types.


EDIT: Here's the slow right join query plan and the fast full join query plan. They are too big to post a screenshot.

EDIT 2: Actually the query plans have the RIGHT JOIN at 45% and FULL JOIN at 55%, which turns out to be utterly inaccurate (in reality it ends up worse than 99%/1%). I guess this means I have to rely on the actual execution statistics.

EDIT 3: Statistics for the slow RIGHT JOIN:

(40 row(s) affected)
Table 'LoadTestPerformanceCounterCategory'. Scan count 0, logical reads 37556, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LoadTestPerformanceCounter'. Scan count 0, logical reads 176464, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LoadTestScenario'. Scan count 0, logical reads 176464, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LoadTestCase'. Scan count 0, logical reads 176464, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'WebLoadTestTransaction'. Scan count 0, logical reads 13411100, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LoadTestPerformanceCounterInstance'. Scan count 0, logical reads 36563718, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LoadTestPerformanceCounterSample'. Scan count 19721, logical reads 269657, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LoadTestRunInterval'. Scan count 41, logical reads 205, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
    CPU time = 36754 ms,  elapsed time = 36763 ms.

Statistics for the fast FULL JOIN:

(41 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LoadTestPerformanceCounterCategory'. Scan count 0, logical reads 1832, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LoadTestPerformanceCounter'. Scan count 0, logical reads 8608, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LoadTestScenario'. Scan count 0, logical reads 8608, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LoadTestCase'. Scan count 0, logical reads 8608, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'WebLoadTestTransaction'. Scan count 0, logical reads 654200, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LoadTestPerformanceCounterInstance'. Scan count 0, logical reads 1783596, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LoadTestPerformanceCounterSample'. Scan count 962, logical reads 13154, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'LoadTestRunInterval'. Scan count 2, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
    CPU time = 1950 ms,  elapsed time = 1944 ms.

The RIGHT JOIN is doing massively more reads and more scans than the FULL JOIN, despite an apparently similar query plan.

Is Worktable (in FULL JOIN) a hint? Is that a temp table?

Does this seem to suggest that the query optimizer is broken?

like image 779
agentnega Avatar asked Oct 21 '22 15:10

agentnega


2 Answers

OK, it turns out the answer is: bad db statistics. Very bad. As in, never been updated.

exec sp_updatestats; FTW.

[hides head in shame]

like image 135
agentnega Avatar answered Oct 27 '22 09:10

agentnega


This is Execution plan for similar queries. Tables are very small.

Execution Plan

Query 1:

  • Uses INNER JOIN.
  • Execution plan looks to be smaller.
  • However, takes 62% of the total time.

Query 2:

  • Uses FULL JOIN.
  • Execution plan looks to be big.
  • However, takes 38% of the total time.

Reason : INNER JOIN in my case is using HASH MATCH. And FULL JOIN is using NESTED LOOP. This is decided by SQL optimizer that which physical join must be used(however we can for it use other physical join). Check your execution plan , it will help.

like image 39
Ravi Singh Avatar answered Oct 27 '22 09:10

Ravi Singh