
Is there any difference between CROSS APPLY and OUTER APPLY when creating a cartesian product?

When creating a cartesian product between two tables, is there any difference between CROSS APPLY and OUTER APPLY?

This may seem like a silly question given that without a relationship expressed between the tables, the right-hand table can't fail to satisfy the relation, but I'm respectful of what I don't know.

When I look at the execution plans with a simple test setup, they're identical [two index seeks feeding into Nested Loops (Inner Join)], but simple test setups can be deceptive.

Here's an example of what I mean (SQL Fiddle). The setup:

CREATE TABLE dbo.First (
    Id      INT IDENTITY(1, 1) PRIMARY KEY,
    Name    NVARCHAR(100)
);
GO
DECLARE @n INT = 1;
WHILE @n < 10000
BEGIN
    INSERT INTO dbo.First (Name) VALUES ('First' + CONVERT(NVARCHAR(100), @n));
    SET @n = @n + 1;
END
GO
CREATE INDEX IX__First__Name ON dbo.First(Name);
GO
CREATE TABLE dbo.Second (
    Id      INT IDENTITY(1, 1) PRIMARY KEY,
    Name    NVARCHAR(100)
);
GO
DECLARE @n INT = 1;
WHILE @n < 10000
BEGIN
    INSERT INTO dbo.Second (Name) VALUES ('Second' + CONVERT(NVARCHAR(100), @n));
    SET @n = @n + 1;
END
GO
CREATE INDEX IX__Second__Name ON dbo.Second(Name);
GO

Using CROSS APPLY:

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
CROSS APPLY Second
WHERE       First.Name IN ('First253', 'First3304')
AND         Second.Name IN ('Second6543', 'Second517');

Using OUTER APPLY:

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
OUTER APPLY Second                                 -- <== Only change is here
WHERE       First.Name IN ('First253', 'First3304')
AND         Second.Name IN ('Second6543', 'Second517');

...both of which give me the expected four rows.

Plus several variations where either or both IN clauses return no matches:

-- No match in First
SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
CROSS APPLY Second
WHERE       First.Name IN ('no match')
AND         Second.Name IN ('Second6543', 'Second517');

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
OUTER APPLY Second
WHERE       First.Name IN ('no match')
AND         Second.Name IN ('Second6543', 'Second517');

-- No match in Second
SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
CROSS APPLY Second
WHERE       First.Name IN ('First253', 'First3304')
AND         Second.Name IN ('no match');

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
OUTER APPLY Second
WHERE       First.Name IN ('First253', 'First3304')
AND         Second.Name IN ('no match');

-- No match in either
SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
CROSS APPLY Second
WHERE       First.Name IN ('no match')
AND         Second.Name IN ('no match');

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
OUTER APPLY Second
WHERE       First.Name IN ('no match')
AND         Second.Name IN ('no match');

...all of which give me the expected zero rows.

T.J. Crowder asked Jan 26 '18 11:01


2 Answers

The difference comes into play when the applied table or table-valued function returns no rows:

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
OUTER APPLY (SELECT * FROM Second WHERE Second.Id = -1) Second
WHERE       First.Name IN ('First253', 'First3304');

2 rows returned (SecondId is NULL in both)


SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
CROSS APPLY (SELECT * FROM Second WHERE Second.Id = -1) Second
WHERE       First.Name IN ('First253', 'First3304');

0 rows returned

In OP's own words:

Not the way you're doing it, because conceptually you're filtering with WHERE after the APPLY (although the plans show the engine optimizing by doing it first); but if you explicitly filter first and then APPLY like this:

SELECT      First.Id AS FirstId, FilteredSecond.Id AS SecondId
FROM        First
CROSS APPLY (SELECT Id FROM Second WHERE Name IN ('xxx')) FilteredSecond 
WHERE       First.Name IN ('First253', 'First3304');

you'd see the difference because you'd get rows with NULLs with the OUTER but no rows with the CROSS.
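
For comparison, here is a sketch of the OUTER APPLY counterpart, assuming the setup from the question (so no row in Second is named 'xxx'). The second query is an added illustration, not part of the original comment: it suggests why the queries in the question could not differ, since a WHERE predicate on a column of the applied table discards the NULL-extended rows again.

```sql
-- OUTER APPLY: both First rows are kept, each with a NULL SecondId.
SELECT      First.Id AS FirstId, FilteredSecond.Id AS SecondId
FROM        First
OUTER APPLY (SELECT Id, Name FROM Second WHERE Name IN ('xxx')) FilteredSecond
WHERE       First.Name IN ('First253', 'First3304');

-- Same OUTER APPLY, but the outer WHERE also tests the applied table.
-- NULL IN ('xxx') evaluates to UNKNOWN, so the NULL-extended rows are
-- filtered out and the result is empty, just as with CROSS APPLY.
SELECT      First.Id AS FirstId, FilteredSecond.Id AS SecondId
FROM        First
OUTER APPLY (SELECT Id, Name FROM Second WHERE Name IN ('xxx')) FilteredSecond
WHERE       First.Name IN ('First253', 'First3304')
AND         FilteredSecond.Name IN ('xxx');
```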

Y.B. answered Sep 22 '22 02:09


Think of CROSS APPLY as related to an INNER JOIN and OUTER APPLY as related to a LEFT JOIN.

  • CROSS / INNER limits the result to rows that have a match in both sources, while
  • OUTER / LEFT returns all rows of the first source, with NULLs where the second source has no related row.
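
As a rough sketch of this analogy, assuming the First and Second tables from the question: when the right-hand side does not reference the left-hand row, the APPLY forms can be rewritten as plain joins. The alias EmptySecond and the ON 1 = 1 condition are illustrative choices, not taken from the question.

```sql
-- CROSS APPLY with an uncorrelated right side: same result as a CROSS JOIN.
SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
CROSS JOIN  Second
WHERE       First.Name IN ('First253', 'First3304')
AND         Second.Name IN ('Second6543', 'Second517');

-- OUTER APPLY with an uncorrelated, empty right side: same result as a
-- LEFT JOIN whose condition is always true. Both First rows come back,
-- each with a NULL SecondId.
SELECT      First.Id AS FirstId, EmptySecond.Id AS SecondId
FROM        First
LEFT JOIN   (SELECT Id FROM Second WHERE Id = -1) AS EmptySecond ON 1 = 1
WHERE       First.Name IN ('First253', 'First3304');
```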

The difference is that JOINs link two result sets through a join condition, while APPLY is evaluated row by row, with the values of the current row available to the right-hand side.
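
A minimal sketch of such a row-wise (correlated) APPLY, assuming the question's setup ran cleanly so that Id runs from 1 to 9999 in both tables. The alias NextSecond and the "next higher Id" lookup are made up for illustration. The subquery references First.Id, so it is evaluated once per outer row, which a plain derived table in a JOIN cannot do.

```sql
-- For each selected First row, fetch the Second row with the next higher Id.
-- First9999 has Id 9999 and Second has no Id above 9999, so CROSS APPLY
-- drops that row and returns only the row for First253.
SELECT      First.Id AS FirstId, NextSecond.Id AS SecondId
FROM        First
CROSS APPLY (SELECT TOP (1) Second.Id
             FROM Second
             WHERE Second.Id > First.Id      -- correlation to the outer row
             ORDER BY Second.Id) AS NextSecond
WHERE       First.Name IN ('First253', 'First9999');

-- The same query with OUTER APPLY keeps First9999 with a NULL SecondId.
SELECT      First.Id AS FirstId, NextSecond.Id AS SecondId
FROM        First
OUTER APPLY (SELECT TOP (1) Second.Id
             FROM Second
             WHERE Second.Id > First.Id
             ORDER BY Second.Id) AS NextSecond
WHERE       First.Name IN ('First253', 'First9999');
```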

You can use APPLY to create computed values out of row values or (which is the main purpose) to call a TVF with row values as parameters. APPLY is also often seen together with XMLTypedVariable.nodes().
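
Two short sketches of these uses, under stated assumptions: the first reuses the First table from the question, and the alias Calc and its column names are hypothetical. The second uses a made-up table variable @Docs with an XML column, since the question contains no XML data.

```sql
-- Computed values: the APPLY block names the expressions once so they
-- can be reused in the SELECT list (and, if needed, in WHERE).
SELECT      First.Id, First.Name, Calc.NameLength, Calc.NumberPart
FROM        First
CROSS APPLY (SELECT LEN(First.Name)                AS NameLength,
                    SUBSTRING(First.Name, 6, 100)  AS NumberPart) AS Calc
WHERE       First.Name IN ('First253', 'First3304');

-- Shredding XML per row with .nodes(). @Docs is hypothetical sample data.
DECLARE @Docs TABLE (Id INT, Doc XML);
INSERT INTO @Docs (Id, Doc)
VALUES (1, N'<root><item id="10"/><item id="11"/></root>'),
       (2, N'<root/>');                      -- no <item> elements

-- OUTER APPLY keeps document 2 with a NULL ItemId;
-- CROSS APPLY in its place would return only the two items of document 1.
SELECT      d.Id, i.n.value('@id', 'int') AS ItemId
FROM        @Docs AS d
OUTER APPLY d.Doc.nodes('/root/item') AS i(n);
```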

Some thoughts about execution: in simple cases (like the one above) the engine will choose the same plan for both. But in more complex scenarios the differences can be huge.

Shnugo answered Sep 21 '22 02:09