Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cross apply (select top 1) much slower than row_number()

Using AdventureWorks, listed below are queries for For each Product get any 1 row of its associated SalesOrderDetail.

Using cross apply it takes 14000ms. The equivalent row_number version takes only 70ms (200x faster).

cross apply is also slower than a simple inner join of all Products and SalesOrderDetails which returns 121317 rows (vs 266 rows when limited with TOP 1).

I prefer the cross apply syntax for this kind of query because it's cleaner than the row_number version. But obviously the cross apply version is using a very inefficient execution plan and too slow to be usable.

It seems to me the query is not working as intended. It should not take 14 seconds to run this simple query. I've used cross apply in other cases and never encountered anything this slow. My question is: what about this particular query that is confusing the query optimizer? Is there any query hints that can be applied to help it use the optimal execution plan? As suggested by @pacreely I've added statistics for each query.

--CROSS APPLY ~14000ms
SELECT  P.ProductID
       ,P.Name
       ,P.ProductNumber
       ,P.Color
       ,SOD.SalesOrderID
       ,SOD.UnitPrice
       ,SOD.UnitPriceDiscount
       ,SOD.LineTotal
FROM    Production.Product P
        CROSS APPLY ( SELECT TOP 1
                                *
                      FROM      Sales.SalesOrderDetail S
                      WHERE     S.ProductID = P.ProductID ) SOD;

--ROW_NUMBER ~70ms
SELECT  *
FROM    ( SELECT    P.ProductID
                   ,P.Name
                   ,P.ProductNumber
                   ,P.Color
                   ,SOD.SalesOrderID
                   ,SOD.UnitPrice
                   ,SOD.UnitPriceDiscount
                   ,SOD.LineTotal
                   ,ROW_NUMBER() OVER ( PARTITION BY P.ProductID ORDER BY P.ProductID ) RowNum
          FROM      Production.Product P
                    INNER JOIN Sales.SalesOrderDetail SOD ON SOD.ProductID = P.ProductID ) X
WHERE   X.RowNum = 1;

--Simple INNER JOIN ~400ms (121317 rows)
SELECT  P.ProductID
       ,P.Name
       ,P.ProductNumber
       ,P.Color
       ,SOD.SalesOrderID
       ,SOD.UnitPrice
       ,SOD.UnitPriceDiscount
       ,SOD.LineTotal
FROM    Production.Product P
        INNER JOIN Sales.SalesOrderDetail SOD ON SOD.ProductID = P.ProductID;

And maybe related to this problem, cross apply without SalesOrderDetail.LineTotal is 10x faster.

--CROSS APPLY (Without LineTotal) ~1200ms 
SELECT  P.ProductID
       ,P.Name
       ,P.ProductNumber
       ,P.Color
       ,SOD.SalesOrderID
       ,SOD.SalesOrderDetailID
       ,SOD.CarrierTrackingNumber
       ,SOD.OrderQty
       ,SOD.ProductID
       ,SOD.SpecialOfferID
       ,SOD.UnitPrice
       ,SOD.UnitPriceDiscount
       ,SOD.rowguid
       ,SOD.ModifiedDate
FROM    Production.Product P
        CROSS APPLY ( SELECT TOP 1
                                *
                      FROM      Sales.SalesOrderDetail S
                      WHERE     S.ProductID = P.ProductID ) SOD;

enter image description here Execution plans

Cross apply statistics

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

(266 row(s) affected)
Table 'SalesOrderDetail'. Scan count 1, logical reads 363114, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Product'. Scan count 1, logical reads 15, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 15688 ms,  elapsed time = 16397 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

Row_number statistics:

SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

(266 row(s) affected)
Table 'Product'. Scan count 9, logical reads 40, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderDetail'. Scan count 9, logical reads 1371, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 360 ms,  elapsed time = 266 ms.
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
like image 376
Tedy Pranolo Avatar asked Feb 25 '17 04:02

Tedy Pranolo


Video Answer


1 Answers

Thanks for all of your suggestions. As suggested by pacreely I checked and found out that LineTotal was actually a computed column. So it make sense it would slow everything down as the calculation is repeated for each row. However as shown above even without LineTotal, it's still too slow. Which then lead me to remove all columns but the Id from the cross apply clause. Finally I added an inner join to retrieve all the needed column. This version of the cross apply query as fast as the row_number query

SELECT  P.ProductID
       ,P.Name
       ,P.ProductNumber
       ,P.Color
       ,SOD.SalesOrderID
       ,SOD.UnitPrice
       ,SOD.UnitPriceDiscount
       ,SOD.LineTotal
FROM    Production.Product P
        CROSS APPLY ( SELECT TOP 1
                                S.SalesOrderDetailID
                      FROM      Sales.SalesOrderDetail S
                      WHERE     S.ProductID = P.ProductID ) SODID
        INNER JOIN Sales.SalesOrderDetail SOD ON SOD.SalesOrderDetailID = SODID.SalesOrderDetailID
like image 200
Tedy Pranolo Avatar answered Oct 01 '22 03:10

Tedy Pranolo