When should I use CROSS APPLY over INNER JOIN?

What is the main purpose of using CROSS APPLY?

I have read (vaguely, through posts on the Internet) that cross apply can be more efficient when selecting over large data sets if you are partitioning. (Paging comes to mind)

I also know that CROSS APPLY doesn't require a UDF as the right-table.

In most INNER JOIN queries (one-to-many relationships), I could rewrite them to use CROSS APPLY, but they always give me equivalent execution plans.

Can anyone give me a good example of when CROSS APPLY makes a difference in those cases where INNER JOIN will work as well?


Edit:

Here's a trivial example, where the execution plans are exactly the same. (Show me one where they differ and where cross apply is faster/more efficient)

    create table Company (
        companyId int identity(1,1)
    ,   companyName varchar(100)
    ,   zipcode varchar(10)
    ,   constraint PK_Company primary key (companyId)
    )
    GO

    create table Person (
        personId int identity(1,1)
    ,   personName varchar(100)
    ,   companyId int
    ,   constraint FK_Person_CompanyId foreign key (companyId) references dbo.Company(companyId)
    ,   constraint PK_Person primary key (personId)
    )
    GO

    insert Company
    select 'ABC Company', '19808' union
    select 'XYZ Company', '08534' union
    select '123 Company', '10016'

    insert Person
    select 'Alan', 1 union
    select 'Bobby', 1 union
    select 'Chris', 1 union
    select 'Xavier', 2 union
    select 'Yoshi', 2 union
    select 'Zambrano', 2 union
    select 'Player 1', 3 union
    select 'Player 2', 3 union
    select 'Player 3', 3

    /* using CROSS APPLY */
    select *
    from Person p
    cross apply (
        select *
        from Company c
        where p.companyid = c.companyId
    ) Czip

    /* the equivalent query using INNER JOIN */
    select *
    from Person p
    inner join Company c on p.companyid = c.companyId
Jeff Meatball Yang asked Jul 16 '09 17:07


2 Answers

Can anyone give me a good example of when CROSS APPLY makes a difference in those cases where INNER JOIN will work as well?

See the article in my blog for detailed performance comparison:

  • INNER JOIN vs. CROSS APPLY

CROSS APPLY works better on things that have no simple JOIN condition.

This query selects the last 3 records from t2 for each record in t1:

    SELECT  t1.*, t2o.*
    FROM    t1
    CROSS APPLY
            (
            SELECT  TOP 3 *
            FROM    t2
            WHERE   t2.t1_id = t1.id
            ORDER BY
                    t2.rank DESC
            ) t2o

It cannot be easily formulated with an INNER JOIN condition.

You could probably do something similar using a CTE and a window function:

    WITH    t2o AS
            (
            SELECT  t2.*, ROW_NUMBER() OVER (PARTITION BY t1_id ORDER BY rank DESC) AS rn
            FROM    t2
            )
    SELECT  t1.*, t2o.*
    FROM    t1
    INNER JOIN
            t2o
    ON      t2o.t1_id = t1.id
            AND t2o.rn <= 3

but this is less readable and probably less efficient.

Update:

Just checked.

master is a table of about 20,000,000 records with a PRIMARY KEY on id.

This query:

    WITH    q AS
            (
            SELECT  *, ROW_NUMBER() OVER (ORDER BY id) AS rn
            FROM    master
            ),
            t AS
            (
            SELECT  1 AS id
            UNION ALL
            SELECT  2
            )
    SELECT  *
    FROM    t
    JOIN    q
    ON      q.rn <= t.id

runs for almost 30 seconds, while this one:

    WITH    t AS
            (
            SELECT  1 AS id
            UNION ALL
            SELECT  2
            )
    SELECT  *
    FROM    t
    CROSS APPLY
            (
            SELECT  TOP (t.id) m.*
            FROM    master m
            ORDER BY
                    id
            ) q

is instant.

Quassnoi answered Oct 10 '22 08:10


Suppose you have two tables.

MASTER TABLE

    x------x--------------------x
    | Id   |        Name        |
    x------x--------------------x
    |  1   |          A         |
    |  2   |          B         |
    |  3   |          C         |
    x------x--------------------x

DETAILS TABLE

    x------x--------------------x-------x
    | Id   |      PERIOD        |   QTY |
    x------x--------------------x-------x
    |  1   |   2014-01-13       |   10  |
    |  1   |   2014-01-11       |   15  |
    |  1   |   2014-01-12       |   20  |
    |  2   |   2014-01-06       |   30  |
    |  2   |   2014-01-08       |   40  |
    x------x--------------------x-------x

There are many situations where we need to replace INNER JOIN with CROSS APPLY.

1. Join two tables based on TOP n results

Suppose we need to select Id and Name from the Master table, together with the last two dates for each Id from the Details table. A first attempt with INNER JOIN might look like this:

    SELECT M.ID, M.NAME, D.PERIOD, D.QTY
    FROM MASTER M
    INNER JOIN
    (
        SELECT TOP 2 ID, PERIOD, QTY
        FROM DETAILS D
        ORDER BY CAST(PERIOD AS DATE) DESC
    ) D
    ON M.ID = D.ID
  • SQL FIDDLE

The above query generates the following result.

    x------x---------x--------------x-------x
    |  Id  |   Name  |   PERIOD     |  QTY  |
    x------x---------x--------------x-------x
    |   1  |   A     | 2014-01-13   |  10   |
    |   1  |   A     | 2014-01-12   |  20   |
    x------x---------x--------------x-------x

This is wrong. The derived table is evaluated once: it picks the two most recent rows in the whole Details table, and only then are those rows joined to Master on Id. The result should contain both Ids 1 and 2, but only Id 1 appears because both of the two most recent dates belong to Id 1. To get the last two dates per Id, we need CROSS APPLY.

    SELECT M.ID, M.NAME, D.PERIOD, D.QTY
    FROM MASTER M
    CROSS APPLY
    (
        SELECT TOP 2 ID, PERIOD, QTY
        FROM DETAILS D
        WHERE M.ID = D.ID
        ORDER BY CAST(PERIOD AS DATE) DESC
    ) D
  • SQL FIDDLE

This produces the following result.

    x------x---------x--------------x-------x
    |  Id  |   Name  |   PERIOD     |  QTY  |
    x------x---------x--------------x-------x
    |   1  |   A     | 2014-01-13   |  10   |
    |   1  |   A     | 2014-01-12   |  20   |
    |   2  |   B     | 2014-01-08   |  40   |
    |   2  |   B     | 2014-01-06   |  30   |
    x------x---------x--------------x-------x

Here's how it works. The query inside CROSS APPLY can reference the outer table, whereas a derived table in an INNER JOIN cannot (it raises a compile error). The correlation is expressed inside the CROSS APPLY, i.e. WHERE M.ID=D.ID, so the TOP 2 is evaluated separately for each row of Master.
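Note that Id 3 does not appear in the CROSS APPLY result, because it has no rows in Details. If those Master rows should be kept, the same query can be written with OUTER APPLY. This is only a sketch against the sample MASTER and DETAILS tables above:

```sql
-- Same per-Id TOP 2 query, but OUTER APPLY keeps Master rows
-- for which the inner query returns nothing.
SELECT M.ID, M.NAME, D.PERIOD, D.QTY
FROM MASTER M
OUTER APPLY
(
    SELECT TOP 2 ID, PERIOD, QTY
    FROM DETAILS D
    WHERE M.ID = D.ID
    ORDER BY CAST(PERIOD AS DATE) DESC
) D
-- Id 3 ('C') has no rows in DETAILS, so it is returned once
-- with NULL in PERIOD and QTY.
```

CROSS APPLY behaves like an inner join in this respect and OUTER APPLY like a left outer join.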

2. When we need INNER JOIN functionality using functions.

CROSS APPLY can be used in place of INNER JOIN when we need to combine rows from the Master table with the result of a table-valued function.

    SELECT M.ID, M.NAME, C.PERIOD, C.QTY
    FROM MASTER M
    CROSS APPLY dbo.FnGetQty(M.ID) C

And here is the function

    CREATE FUNCTION FnGetQty
    (
        @Id INT
    )
    RETURNS TABLE
    AS
    RETURN
    (
        SELECT ID, PERIOD, QTY
        FROM DETAILS
        WHERE ID = @Id
    )
  • SQL FIDDLE

which generates the following result

    x------x---------x--------------x-------x
    |  Id  |   Name  |   PERIOD     |  QTY  |
    x------x---------x--------------x-------x
    |   1  |   A     | 2014-01-13   |  10   |
    |   1  |   A     | 2014-01-11   |  15   |
    |   1  |   A     | 2014-01-12   |  20   |
    |   2  |   B     | 2014-01-06   |  30   |
    |   2  |   B     | 2014-01-08   |  40   |
    x------x---------x--------------x-------x
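For comparison, because FnGetQty only filters Details by Id, this particular result could also be obtained with an ordinary join. The following is a sketch against the same sample tables, not a general replacement for the function form:

```sql
-- Returns the same rows as CROSS APPLY dbo.FnGetQty(M.ID) for the
-- sample data, because the function body is only a filter on ID.
SELECT M.ID, M.NAME, D.PERIOD, D.QTY
FROM MASTER M
INNER JOIN DETAILS D
    ON D.ID = M.ID
```

The CROSS APPLY form matters when the function's logic depends on the outer row in a way a join condition cannot express, such as the TOP n case in section 1.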

ADDITIONAL ADVANTAGE OF CROSS APPLY

APPLY can be used as a replacement for UNPIVOT. Either CROSS APPLY or OUTER APPLY can be used here; they are interchangeable in this case because the VALUES constructor always returns rows.

Suppose you have the table below (named MYTABLE).

    x------x-------------x--------------x
    |  Id  |   FROMDATE  |   TODATE     |
    x------x-------------x--------------x
    |   1  |  2014-01-11 | 2014-01-13   |
    |   1  |  2014-02-23 | 2014-02-27   |
    |   2  |  2014-05-06 | 2014-05-30   |
    |   3  |     NULL    |    NULL      |
    x------x-------------x--------------x

The query is below.

    SELECT DISTINCT ID, DATES
    FROM MYTABLE
    CROSS APPLY (VALUES (FROMDATE), (TODATE)) COLUMNNAMES(DATES)
  • SQL FIDDLE

which returns the result

    x------x-------------x
    | Id   |    DATES    |
    x------x-------------x
    |  1   |  2014-01-11 |
    |  1   |  2014-01-13 |
    |  1   |  2014-02-23 |
    |  1   |  2014-02-27 |
    |  2   |  2014-05-06 |
    |  2   |  2014-05-30 |
    |  3   |    NULL     |
    x------x-------------x
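For reference, a version using the UNPIVOT operator might look like the sketch below. The names COLUMNNAME and U are hypothetical, chosen only for this illustration, and the query assumes FROMDATE and TODATE have the same data type, which UNPIVOT requires:

```sql
-- UNPIVOT version of the query above.
-- COLUMNNAME receives the source column name ('FROMDATE' or 'TODATE').
SELECT DISTINCT ID, DATES
FROM MYTABLE
UNPIVOT
(
    DATES FOR COLUMNNAME IN (FROMDATE, TODATE)
) U
```

One difference to be aware of: UNPIVOT discards NULL values, so the row for Id 3 would not appear in this result, whereas the CROSS APPLY (VALUES ...) form keeps it.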
Sarath KS answered Oct 10 '22 08:10